ABSTRACT


While deployed at sea, sailors have traditionally received much of their education
through correspondence and PACE courses. But with recent developments in the
Internet  and  videoconferencing,  it  is  now  feasible  to  deliver  real-time  educational 
material anywhere, even to a ship at sea. This thesis investigates the current status of
networked  desktop  videoconferencing  technology,  and  its  use  in  support  of  Joint  Vision 
2010,  with  respect  to  Distance  Learning.  It  provides  an  analysis  of  videoconferencing 
protocols,  standards,  and  applications,  as  well  as  a  videoconferencing  pilot  project.  The 
objective of the analysis is to determine the viability and economic benefits of using
videoconferencing technology and collaboration tools, from the desktop, as a means for
simultaneously delivering synchronous and asynchronous distance learning material from
an academic location to multiple students at remote locations. The results show that
desktop videoconferencing technology, via IP-based networks in the Defense Information
Infrastructure, is a viable tool that can add numerous economic benefits, such as
decreased spending for travel and elimination of the need to rely on large, room-based
systems.
I.  INTRODUCTION


A.  INTRODUCTION 

This  thesis  investigates  the  current  status  of  networked  desktop  videoconferencing 
technology,  and  its  use  in  support  of  Joint  Vision  2010,  with  respect  to  Distance  Learning. 
It  provides  an  analysis  of  videoconferencing  protocols,  standards,  and  applications,  as  well 
as a videoconferencing pilot project. It also follows work from the thesis “Internetworking:
Economical Storage and Retrieval of Digital Audio and Video for Distance Learning” [Tiddy,
96].


B.  MOTIVATION

DoD  has  implemented  various  videoconferencing  systems  in  order  to  make  distance 
learning  more  available,  but  there  are  still  major  obstacles. 

The  current  systems  that  have  been  put  into  place  are  usually  based  upon  a  model 
using  a  dedicated  room  or  roll-about  system,  with  proprietary  hardware  and  software.  Also, 
users  are  still  required  to  travel  to  the  room-based  systems  in  order  to  participate  in  the 
training  sessions.  Surveys  of  room  videoconferencing  system  users  have  identified  desired 
features  such  as  shared  drawing  area,  the  ability  to  connect  to  multiple  sites,  and  ways  to 
incorporate computer applications into the conference [Rettinger, 95]. Since there can be a
large  geographical  dispersion  of  military  personnel  across  numerous  time  zones,  there  is 
also  the  problem  of  coordination  of  class  times  between  the  instructor  and  the  student. 

Using desktops to deliver videoconferencing has multiple advantages. As users
become more familiar with the use of PCs, they will not need to learn how to provide
instruction using a room-based system, which usually requires a dedicated person to manage
the equipment. The instructor does not have to deal with scheduling blocks of time to use
the  room-based  systems.  Conferencing  over  the  desktop  can  be  more  relaxed  and 
impromptu,  contributing  to  better  human  interaction.  Most  desktop  videoconferencing 
software  has  whiteboard  capabilities,  allowing  the  student  and  instructor  to  share  data  in 
real-time. 


C.  OBJECTIVE  OF  THESIS 

The  primary  objective  of  this  thesis  is  to  describe  how  desktop  videoconferencing 
technology  and  collaboration  tools  can  be  used  either  synchronously  or  asynchronously  to 
deliver  Distance  Learning  content  over  an  IP  based  network  to  multiple  students  at  remote 
locations.  Instructors  might  be  a  Chief  Petty  Officer  (CPO)  at  Fleet  Training  Center 
Pacific,  an  Admiral  in  Washington  D.C.,  or  a  professor  at  the  Naval  Postgraduate  School 
(NPS). The human-computer interaction and social aspects of desktop videoconferencing
are not discussed here, but can be found in [Rettinger, 95]. Test
and evaluation of a prototype system at NPS provides an example of how
distance learning can be delivered via the PC to any remote user’s desktop. Specifically, the
research  and  experiments  for  this  thesis  were  designed  to  collect  data  to  address  the 
following  research  questions: 

•  How can we leverage the Defense Information Systems Network (DISN) to
implement  desktop  videoconferencing  distance  learning  to  the  sea? 

•  What  are  some  of  the  current  protocols  and  standards  available  in  order  to 
multicast  desktop  videoconferencing  applications  via  an  IP  based  network? 

•  How  can  we  leverage  the  Navy’s  current  JMCOMMS/ADNS  program  to 
implement desktop videoconferencing distance learning to a shipboard LAN at
sea? 

•  What are the technical and management concerns when multicasting
videoconferencing  applications  to  the  user  at  sea? 

•  What  impact  will  multicasting  video  over  DISN  have  on  the  system 
bandwidth/availability? 

•  What are the hardware and software requirements for the instructor and student,
in order to maintain reliable communications throughout a course of instruction?

•  What  are  some  of  the  available  videoconferencing  applications 
that  can  be  used  for  distance  learning? 

•  How  much  will  desktop  videoconferencing  (distance  learning)  offset  travel 
expenses  for  resident  education? 


Preliminary  results  are  evaluated  for  each  of  these  questions. 


D.  SCOPE  OF  THE  THESIS

The scope of this thesis includes: (1) showing how multicasting across IP-based
networks can be used to deliver desktop videoconferencing distance learning to sea; (2)
reviewing some of the currently available videoconferencing products and how their use can
be leveraged for distance learning; and (3) using a prototype to test and evaluate the feasibility of
the delivery and storage of videoconferencing data over an IP-based architecture to sea. The
goal is to evaluate and determine the economic and technical benefits of using currently
available  desktop  videoconferencing  applications  (versus  cart  and  room-based  systems)  as 
an  alternative  tool  that  an  instructor  and  student  can  use  to  exchange  course  material  over  an 
IP-based  Internet  and  DISN. 

The demonstration incorporates desktop workstations with cameras, video capture
cards, audio cards, and network connections to IP multicast-capable routers. Besides the
standard Internet protocols normally found on current desktop computers, the workstations also run
videoconferencing applications capable of multicasting video and audio, either
synchronously or asynchronously, to naval students at remote locations and at sea.
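
As a rough illustration of the receiving side of such a workstation, the sketch below shows how an application might subscribe to an IP multicast group using the standard Python socket module. It is not the software used in the demonstration; the function names are hypothetical, and any group address or port passed in (for example 224.2.0.1 and 5004) would be placeholder values chosen by the reader.

```python
import socket
import struct


def build_membership_request(group: str, interface: str = "0.0.0.0") -> bytes:
    """Pack the 8-byte ip_mreq structure: group address, then local interface."""
    return struct.pack("=4s4s", socket.inet_aton(group), socket.inet_aton(interface))


def open_multicast_receiver(group: str, port: int) -> socket.socket:
    """Return a UDP socket that has joined the given multicast group."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    # Allow several receivers on one host to listen on the same port.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    # Ask the local router (via IGMP) to forward traffic for this group.
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    build_membership_request(group))
    return sock
```

A student workstation would call `open_multicast_receiver` once and then read datagrams with `recvfrom`; the instructor's workstation sends a single copy of each packet to the group address, and the multicast-capable routers replicate it toward every subscribed receiver.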


E.  METHODOLOGY 

The  methodology  used  to  produce  this  thesis  included  the  following  tasks: 

•  Conduct  a  literature  search  of  books,  magazines,  articles,  Internet  resources  and 
other  library  information  services  describing  videoconferencing  technology  and 
current  software/hardware  that  can  be  applied  to  distance  learning  in  the  military. 

•  Conduct  a  search  of  books,  magazines,  articles,  Internet  resources,  and  consult 
with  companies  to  determine  the  current  videoconferencing  software  and 
hardware that are best suited for Internet-to-the-sea videoconferencing.

•  Develop  a  model  to  demonstrate  how  distance  learning  courses  can  be 
seamlessly  transported  from  the  instructor  to  the  Internet  and  the  Navy’s 
communication networks infrastructure, in order to provide Internet-to-sea
videoconferencing. 

•  Develop a prototype videoconferencing system that might be used as a part of a
“toolbox” that can be used to export a correspondence course or graduate school
class  to  a  ship. 

•  Consult with the Space and Naval Warfare Systems Command (SPAWAR) and
the Research, Testing and Evaluation Division of the Naval Command Control
and Ocean Surveillance Center (NRAD) on current developments of the Joint
Maritime Communications System/Automated Digital Network System
(JMCOMMS/ADNS)  and  its  current  use  with  videoconferencing  technology. 


F.  THESIS  ORGANIZATION 

This  thesis  is  composed  of  eight  chapters.  This  chapter  provides  the  motivation, 
objectives,  research  questions,  scope  and  methodology  employed  to  conduct  the  research. 
Chapter II provides the history of videoconferencing, and related work. Chapter III
discusses  the  current  video  and  audio  compression  protocols  and  standards  that  are  required 
for  current  videoconferencing  systems.  Chapter  IV  describes  the  various  multicasting 
protocols  and  standards  necessary  to  provide  scalability,  cross-platform  support  and  quality 
of  service  (QoS)  necessary  to  provide  distance  learning  from  the  desktop  over  the 
commercial  and  naval  IP  based  networks.  Chapter  V  describes  various  options  that  can  be 
applied  over  the  DISN  architecture  that  will  support  IP  based  desktop  videoconferencing  to 
sea.  Chapter  VI  compares  some  of  the  desktop  videoconferencing  applications  and 
protocols required to deliver distance education to sea. Chapter VII discusses the
demonstration project and findings. Chapter VIII provides the conclusion, summary and
recommendation for future research.

II.  RELATED  WORK


A.  INTRODUCTION 

This  chapter  provides  a  brief  history  of  videoconferencing  and  the  traditional 
methods  used  to  provide  distance  learning  to  personnel  in  remote  locations.  It  gives  a  brief 
overview  of  the  various  methods  that  can  be  employed  to  deliver  distance  learning  across  a 
wide-area network (WAN). Finally, it describes some of the current VTC/videoconferencing
solutions  used  in  the  Navy  and  DoD. 


B.  BRIEF  HISTORY  OF  VIDEOTELECONFERENCING 

Videoconferencing was first introduced in 1926 when AT&T’s President, Walter S.
Gifford, used video teleconferencing to speak with the Secretary of Commerce, Herbert
Hoover [Nerino, 94]. Not until the late forties and early fifties, with the advent of the
television,  did  the  next  major  breakthrough  in  video  technology  come  about.  After 
television,  videoconferencing  did  not  see  another  major  breakthrough  until  AT&T 
introduced  its  picture  telephone  at  the  1964  New  York  World’s  Fair.  Even  then,  because 
videoconferencing  contained  frequencies  that  were  beyond  those  used  by  telephone 
networks at that time, expensive satellites were used to provide the medium needed for the high
bandwidths required for videoconferencing. By 1983, full-bandwidth satellite transmissions
still  cost  over  $1  million  per  year  [Nerino,  94].  Today  such  satellite  links  are  becoming 
more  affordable. 

As the 1970s progressed, new advances in computing power and improved methods
for converting analog signals to digital formats resulted in telephone service providers
transitioning to digital transmission methods to complement the existing analog processing
systems. Although videoconferencing has become more widely used for services like
business meetings, collaborative research, and distance learning, these services are generally
performed over dedicated leased lines and usually require expensive room-based or
roll-about videoconferencing systems.

Today,  due  to  faster  desktop  computers  and  the  rapid  expansion  of  the  World  Wide 
Web  and  the  Internet,  transmitting  real-time  video  using  desktop  computers  to  remote 
locations  has  become  practical.  Although  there  is  currently  an  explosion  in  the  number  of 
applications  that  can  transmit  and  receive  streaming  audio  and  video  to  and  from  a  PC  over 
the Internet, there continue to be significant interoperability, protocol, and architectural
issues  that  must  be  addressed  if  videoconferencing  is  to  become  commonplace  from  the 
desktop. 


C.  DISTANCE  LEARNING 

1.  Traditional  Educational  Methods 

Educational development has always been required in the career progression of
naval personnel. This training is essential to achieving and maintaining national security,
as well as national strategic objectives [Emswiler, 95]. Traditionally, the primary
methods of providing the necessary education to naval personnel have been the
following:

•  Short-term temporary duty seminars

•  Resident  education  at  technical  schools  (A,  B,  and  C  schools) 

•  Resident  education  at  undergraduate  or  graduate  educational  institutions, 
e.g.  Naval  Postgraduate  School  (NPS)  or  Naval  War  College. 

•  Postal-based correspondence courses.

Courses that require travel on a temporary additional duty (TAD) basis are useful for initial or
technical refresher training; nevertheless, this approach is costly and requires travel by the instructor,
student, or both.

Resident  education  at  NPS  requires  students  to  stay  away  from  the  operational  forces 
for  two  years,  on  average.  Although  many  courses  require  the  student  to  be  present  to 
obtain  the  desired  educational  benefit,  others  can  be  easily  and  readily  exported  to  sea  or  a 
remote  shore  location. 

Traditionally, postal-based correspondence courses have been necessary due to the
remote locations at which naval personnel are often stationed. If the course is equivalent to a
resident course, however, management of the correspondence course will be substantial. In
order for the correspondence course to be successful, not only must there be a sustained
commitment from the student, but the feedback loop to the student must be amenable to
continuing, timely instruction. Often this is not the case; for numerous reasons, it
may be weeks before the student receives feedback or new
modules. As a result, many students do not finish.

2.  The Value of Distance Learning in the Navy


Distance learning in the Navy can be beneficial in two important areas: cost and
global reach. With a decreasing defense budget, the allocation of MILPERS dollars, which
pay for travel and education, is ever decreasing. Besides costs, the naval environment
requires personnel to be deployed to remote or isolated settings that are far from traditional
educational resources. A more time-efficient delivery of course material and feedback to
the student can markedly improve the student’s dedication to completing the course of
instruction. Figure 2-1 outlines the general situations in which distance learning can be
advantageous compared with traditional methods.

•  Target  audience  is  widely  scattered  and  it  is  not  cost  effective  or  possible  to  have 
them  travel  to  a  central  training  location. 

•  Content  or  consistency  in  delivery  is  so  critical  that  it  must  be  carefully  controlled 
for  accuracy  or  correct  interpretation. 

•  Content  is  too  dangerous  for  novices  to  participate  in  and  distance  education  will 
allow for familiarization and confidence building prior to the actual situation.

•  Scheduling  difficulties  arise  because  the  student  cannot  take  extended  time  from 
other  critical  missions  to  attend  a  normally  conducted  training  program. 

•  The  expense  of  conducting  live  training  is  cost  prohibitive. 

•  There  are  a  limited  number  of  qualified  trainers. 

Figure  2-1  Productive  applications  of  a  distance  education  approach  [Biggs,  94] 


3.  Distance  Learning  via  the  Internet 

The World Wide Web (WWW) provides a means of delivering both time-efficient
course material and research tools. Distance education can be as simple as a
correspondence course offered through electronic mail, as complex as interactive
video teleconferencing over the Internet, or a combination of both [Tiddy, 96].

As more ships, commands, and individual units become connected to local area
networks (LANs) and wide-area networks (WANs), distance learning programs can be
more easily implemented, ultimately providing more economical resources for training
[Emswiler, 95]. Also, as video/audio application and transport protocols and standards
become more established, commercially produced products become more readily available
to furnish the tools necessary to provide distance learning over commercial and Department
of Defense (DoD) networks. To date, however, the growth of Internet and DoD network
applications and users is outpacing the growth of bandwidth. With limited dollars for
education and travel, DoD cannot wait until this trend reverses itself. Therefore it is critical
to use well-developed standards and protocols (e.g., multicasting and compression), along
with existing network infrastructures, in order to get the most efficient delivery to remote
users.
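
The bandwidth argument for multicasting can be made concrete with a small calculation. The sketch below compares the load on the sender's link when one stream is delivered to many students by separate unicast copies versus a single multicast stream. The function name is hypothetical, and any stream rate or class size supplied to it is illustrative rather than a measurement from this thesis.

```python
def sender_load_kbps(stream_kbps: float, receivers: int, multicast: bool) -> float:
    """Bandwidth the sender's uplink must carry to reach all receivers."""
    if receivers < 0:
        raise ValueError("receivers must be non-negative")
    if receivers == 0:
        return 0.0
    # Unicast sends one copy per receiver; multicast sends one copy in total
    # and lets the routers replicate it where paths diverge.
    return stream_kbps if multicast else stream_kbps * receivers


# Illustrative class of 20 students receiving a 128 Kbps stream.
unicast_load = sender_load_kbps(128, 20, multicast=False)
multicast_load = sender_load_kbps(128, 20, multicast=True)
```

With these example numbers the unicast load grows linearly with class size, while the multicast load stays at the rate of a single stream, which is why multicasting matters when bandwidth growth lags demand.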

4.  Videoconferencing in the Department of Defense

DoD  has  used  videoconferencing  technology  in  a  wide  variety  of  applications. 
Some of the major areas where this technology is being used are:

•  Training 

•  Telemedicine 

•  Group  Conferences/Meetings 

•  Crisis  Response 

Videoconferencing  technology  has  started  to  bring  significant  savings  to  DoD, 
mainly in travel expenses. The need for military personnel to travel to attend meetings,
conferences, training, and exercises has been greatly reduced for commands that have access
to  videoconferencing  equipment.  The  following  examples  contain  more  specific 
descriptions  of  areas  where  videoconferencing  technology  is  being  or  has  been  applied  in 
DoD: 

a.)  Training: NPS Distance Learning via the Multicast Backbone (MBone)
system: NPS has conducted “Distance Learning”, or remote classroom instruction, through
the use of videoconferencing technology over the MBone. A 1995 thesis by Tracy
Emswiler demonstrated that videoconferencing technology could be an economically
feasible approach to distance learning. It documented Dr. Richard Hamming’s course,
“Learning to Learn”, being transmitted worldwide over the MBone for an entire quarter
[Emswiler, 95].

NPS is also currently delivering distance learning in Root Hall, using PictureTel
4000 videoconferencing systems over Integrated Services Digital Network, Basic Rate
Interface (ISDN BRI) lines. Courses, and even some degree programs, are offered in
Computer  Science,  Electrical  Engineering,  Aerospace  Engineering  and  Information 
Technology  Management. 

The  Chief  of  Naval  Education  and  Training  (CNET)  Electronic  Schoolhouse 
Network  (CESN)  is  a  two-way  video  and  audio  multipoint,  secure  distance  learning 
network.  It  allows  simultaneous  instruction  to  multiple  shore  and  shipboard  sites,  where 
individuals  can  interact  both  verbally  and  visually  in  a  real-time  mode.  Its  purpose  is  to 
provide  effective  training  to  a  large  number  of  personnel  at  or  near  their  duty  stations, 
eliminating  the  need  for  travel  to  distant  schoolhouses,  thereby  reducing  travel  and  per  diem 
costs. 

The Navy’s Video Tele-training (VTT) CESN is linked via land lines and operates at
a  fractional  T-1  data  rate  of  384  Kbps.  Communication  is  provided  through  the 
government’s  long-haul  communications  network  using  FTS2000.  Satellite  capability  is 
available  for  shipboard  VTT.  The  network  is  made  up  of  16  sites  nationwide  and  includes  a 
site  on  board  the  USS  George  Washington  [CNET,  97]. 

b.)  Telemedicine: This is a field where videoconferencing is making significant
inroads. Basically, the same idea from distance learning is applied to telemedicine: a central
care facility with medical expertise (e.g., physicians, surgical staff) can provide care
“remotely” to a distant site via videoconferencing. A huge potential for this technology
exists in afloat applications, since most U.S. Navy and Coast Guard ships have medical
personnel who can provide only a basic level of care. One practical use was demonstrated
when telemedicine was used on the USS George Washington (CVN 73) to provide
mental health examinations during a 1997 deployment. Psychiatrists successfully evaluated
onboard patients, capturing their mood, body language and response to questions [Koenig,
97]. Additionally, during JWID 97, the Naval Medical Information Management Center
(NMIMC), Bethesda, Maryland, sponsored a demonstration of telemedicine technologies
aboard the submarine USS Atlanta (SSN 712) in Norfolk, Virginia. Once ships are routinely
outfitted with this technology, a tremendous benefit from telemedicine will surely be realized.

c.)  Group Conferencing: In September 1995, a major Joint Task Force (JTF)
Exercise  was  conducted  in  Panama:  Exercise  “Fuertes  Defensas”  (Strong  Defense).  Led  by 
the  Commander,  18th  Airborne  Corps,  this  exercise  was  conducted  to  test  United  States 
readiness  to  support  and  defend  the  Panama  Canal.  Each  day,  the  JTF  Commander  (an 
Army  LGEN)  was  able  to  keep  advised  of  exercise  progress  by  conducting  a  morning 
videoconference with his Army, Navy, Air Force, and Marine Component Commanders.
These commanders were sometimes physically separated by hundreds of miles. Because of
videoconferencing technology, the commander was able both to remain well informed of
exercise progress and to promulgate his own directives and intentions for the
day.

d.)  Crisis  Response:  There  is  a  huge  potential  for  further  use  of 
videoconferencing technology for Crisis Response Management. For example, Navy and
Marine  Corps  Afloat  and  Expeditionary  Commanders  might  receive  real-time  combat 
instructions  from  their  superiors  via  videoconferencing.  Also,  these  Task  Force 
Commanders  might  promulgate  their  own  guidance  to  their  attached  ships  and  elements  in 
the  same  fashion,  all  the  way  down  the  chain  of  command.  There  is  also  a  large  potential 
for  this  technology  in  non-combat  crisis  management  situations,  such  as  humanitarian 
disaster  relief  operations. 

III.  MAJOR  VIDEOCONFERENCING  STANDARDS


A.  INTRODUCTION 


This chapter discusses the major videoconferencing standards, which are a
significant consideration when implementing distance learning to sea from the desktop.


B.  BACKGROUND  INFORMATION


The International Telecommunication Union (ITU), a body of the United Nations
that focuses on developing standards, tasks the Telecommunication Standardization Sector
(ITU-T) with developing telephony standards. ITU-T develops some of the major protocols that
are used by videoconferencing systems today, such as H.320, H.323, and H.324.
Table 3-1 provides an overview of those standards.


Standard  Description                         Remarks

H.320     "Umbrella" standard that covers     Made mandatory by the Federal
          audio, video, videoconferencing,    Government in 1993.
          graphics, and multicasting.

H.323     Visual (audiovisual)                Addresses audiovisual communications
          communications over LANs.           across LANs and gateways that connect
                                              LANs to the Internet.

H.324     Defines a multimedia communication  Incorporates the most common global
          terminal operating over the         communications facility today, plain
          Switched Telephone Network. It      old telephone service (POTS).
          includes H.261, T.120, and V.34.

Table 3-1: ITU-T Videoconferencing Standards

The videoconferencing systems and standards described above can be viewed as
having evolved over three generations. The 1st generation systems were generally
point-to-point, proprietary systems that usually required dedicated T-1 (1.5 Mbps) networks or better.
Videoconferencing coding and compression was usually done by hardware
compressors/decompressors (codecs). There were not many standards initially because
interoperability of the various systems was not perceived as an issue. 2nd generation
systems were driven by Integrated Services Digital Network (ISDN). The compression was
also usually done by proprietary hardware codecs. As the technology matured, and
compatibility became more of an issue, videoconferencing application developers began to
adopt universal standards, ultimately migrating towards ITU-T’s H.320 protocol. Also,
ISDN's inability to scale to a large number of users limited its acceptance. Today, as
network-centric computing has migrated to the core of many organizations, compatibility
has become a focal point in the development of videoconferencing systems, thus bringing
about 3rd generation system protocols. These new standards are generally designed to match
the ISO seven-layer reference model. Now, with advances in modeling and simulation (such as
MPEG-4 compression) and improved scalability due to multicasting, 4th generation
standards are emerging.


C.  1ST  GENERATION  STANDARDS

1st generation videoconferencing systems are usually large room-based systems that
are connected via dedicated circuit-switched or T-1 connections. These systems are
point-to-point, and use proprietary system standards to deliver and receive content. Additionally,
they are not very scalable, and many of the international standards-based systems used today
are not backwards compatible with them. Therefore they would not be feasible
for providing IP-based distance learning to sea.


D.  2ND  GENERATION  STANDARDS

1.  H.320

H.320, "Narrow-Band Visual Telephone Systems and Terminal Equipment", is the
umbrella standard that covers audio, video, videoconferencing, graphics and multicasting.
ITU-T recommends it as the minimum standard that will ensure that videoconferencing
systems will communicate with each other. H.320 covers a family of standards that governs
videoconferencing systems that use coder/decoders (codecs) operating between 64 Kbps and
1920 Kbps (64 Kbps x 30). It became the mandatory standard for the Federal Government in 1993
[Nerino, 94].
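
The 64 Kbps to 1920 Kbps range follows from the "p x 64" rule, in which a connection aggregates p channels of 64 Kbps each, with p between 1 and 30. The short sketch below simply tabulates that rule; the function and constant names are hypothetical and are not part of the standard.

```python
BASE_CHANNEL_KBPS = 64  # rate of one channel
MAX_CHANNELS = 30       # largest multiplier p covered by H.320


def h320_rate_kbps(p: int) -> int:
    """Aggregate rate of a p x 64 Kbps connection."""
    if not 1 <= p <= MAX_CHANNELS:
        raise ValueError(f"p must be between 1 and {MAX_CHANNELS}")
    return p * BASE_CHANNEL_KBPS


# Rates for every allowed multiplier, from 64 Kbps up to 1920 Kbps.
all_rates = [h320_rate_kbps(p) for p in range(1, MAX_CHANNELS + 1)]
```

For example, p = 6 yields 384 Kbps, the fractional T-1 rate at which the CESN network described in Chapter II operates.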

The differences between the various videoconferencing systems depend upon the
optional requirements that each can support, which ultimately affect the quality of the
audio and video. How well the features are implemented is left up to each manufacturer.
Table 3-2 shows H.320 recommendations and their titles.

Video Codec               H.261: Video codec for audiovisual services at
                                 p x 64 Kbps

Audio Codec               G.711: Pulse code modulation (PCM) of voice
                                 frequencies
                          G.722: 7 kHz audio coding within 64 Kbps
                          G.728: Coding of speech at 16 Kbps using low-delay
                                 code excited linear prediction

Frame Structure           H.221: Frame structure for a 64 to 1920 Kbps channel
                                 in audiovisual teleservices

Control and Indication    H.230: Frame-synchronous control and indication
                                 signals for audiovisual systems

Communication Procedure   H.242: System for establishing communication between
                                 audiovisual terminals using digital channels
                                 up to 2 Mbps

Table 3-2: H.320 Recommendations [Nerino, 94]


H.320 only requires vendors to support the minimum standards. Videoconferencing
systems currently fall into three classes, which should be considered when deciding between systems:

Class  1  -  minimum  level  of  support 

Class  2  -  Class  1  +  support  of  some  optional  features 

Class  3  -  Class  1  +  all  optional  features  [VTEL,  95] 

The  major  factors  that  affect  system  quality  are  picture  resolution,  frame  rate, 
preprocessing  and  postprocessing,  motion  compensation,  audio,  data  rate  and  quality. 


a.  Picture Resolution

Picture resolution is the frame format of the video picture. The National
Television System Committee (NTSC) standard picture frame consists of 780 horizontal
picture elements (pixels) and 480 active vertical lines. Due to bandwidth constraints of the
standard videoconferencing channels used today, that picture size is not practical for current
videoconferencing systems. H.320 uses quarter common intermediate format (QCIF), with
176 x 144 pixel resolution, and common intermediate format (CIF), with 352 x 288 pixel
resolution. If systems that support different picture resolutions are connected, they
negotiate down to the lowest resolution.
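
To see why even these reduced formats depend on compression, the sketch below estimates the uncompressed bit rate of each format and models the negotiation rule described above. It assumes 12 bits per pixel on average (8-bit samples with the two chrominance components subsampled by two in each direction); that figure, along with the function names, is an assumption of this example rather than something stated in the text.

```python
# Picture formats from the text: name -> (width, height) in pixels.
FORMATS = {"QCIF": (176, 144), "CIF": (352, 288)}

# Assumption: 8-bit luminance plus 2:1 subsampled chrominance in both
# directions, which averages 12 bits per pixel.
BITS_PER_PIXEL = 12


def raw_bitrate_bps(fmt: str, fps: float) -> float:
    """Uncompressed bit rate of a video stream in the given format."""
    width, height = FORMATS[fmt]
    return width * height * BITS_PER_PIXEL * fps


def negotiate_format(*best_supported: str) -> str:
    """Each terminal reports its best format; the call uses the smallest."""
    return min(best_supported, key=lambda f: FORMATS[f][0] * FORMATS[f][1])


cif_raw = raw_bitrate_bps("CIF", 30)    # about 36.5 Mbps uncompressed
qcif_raw = raw_bitrate_bps("QCIF", 30)  # about 9.1 Mbps uncompressed
```

Under these assumptions, even QCIF at 30 fps exceeds the 1920 Kbps ceiling of H.320 several times over before compression, so the video codec must remove most of the raw data.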

b.  Frame  Rate 

H.320 can support frame rates of 7.5, 10, 15, and 30 frames per second (fps).
Class 1 systems can support a frame rate of 7.5 fps; Class 2, typically about 15 fps, using
QCIF; and Class 3, 30 fps, using CIF. Frame rate negotiation uses the lower class
when two or more classes are used [VTEL, 95].
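
A brief sketch of the negotiation rule, using the class-to-frame-rate figures quoted above, also shows how the frame rate sets the average number of bits available per frame on a fixed channel. The names are hypothetical, the 384 Kbps channel in the usage lines is only an example, and 1 Kbps is taken as 1000 bits per second.

```python
# Typical frame rates per system class, as given in the text.
CLASS_FPS = {1: 7.5, 2: 15.0, 3: 30.0}


def negotiated_fps(*terminal_classes: int) -> float:
    """A call between mixed classes runs at the lowest class's frame rate."""
    if not terminal_classes:
        raise ValueError("at least one terminal class is required")
    return CLASS_FPS[min(terminal_classes)]


def bits_per_frame(channel_kbps: float, fps: float) -> float:
    """Average bit budget per frame on a channel of the given rate."""
    return channel_kbps * 1000 / fps


# A Class 3 and a Class 1 system connected over an example 384 Kbps channel.
fps = negotiated_fps(3, 1)
budget = bits_per_frame(384, fps)
```

The trade-off is visible in the arithmetic: halving the frame rate doubles the bits available to each frame, so lower classes give up smooth motion rather than per-frame detail.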

c. Preprocessing and Postprocessing

Preprocessing  reduces  the  amount  of  re-coding  in  the  background.  If  there  is 
poor  camera  lighting,  video  “noise”  can  make  the  system  think  that  there  is  motion  in  the 
background  when  in  fact  there  is  none.  Preprocessing  prevents  the  video  encoder  from 
wasting  time  encoding  “noise”  caused  by  the  poor  lighting,  ultimately  ensuring  that  only 
real  motion  gets  encoded  [VTEL,  95]. 

Postprocessing compensates for the picture degradation due to fast motion. It
can help reduce the “blocking” and noisy effects caused by video codecs (discussed in more
detail under H.261). Postprocessing can also be used to enhance the frame rate, thus
reducing jerky motion [VTEL, 95].

d.  Motion  Compensation 

Motion  Compensation  is  another  video  quality  enhancement.  There  are  two 
aspects  of  motion  compensation:  motion  estimation  and  actual  motion  compensation. 
Motion  estimation  is  performed  at  the  video  encoder  to  determine  the  motion  vector  of  the 
subject. Motion compensation is performed at both encoder and decoder. It consists of
moving  blocks  of  video  data  around  based  on  the  motion  vector  determined  during  motion 
estimation.  Especially  important  at  lower  bit  rates,  motion  compensation  moves  only  the 
encoded  section  of  video  where  motion  has  occurred  rather  than  the  entire  video  area  of 
each  frame.  All  H.320  systems  have  the  ability  to  decode  a  motion  compensation  signal. 
Providing  encoded  motion  compensation  (where  the  real  video  quality  improvements  are 
made)  is  optional  [VTEL,  95]. 

Although  the  aforementioned  factors  affect  H.320  system  quality,  many  other 
elements  also  affect  quality.  Table  3-3  provides  a  summary  of  H.320  compliance. 


                        Level 1          Level 2          Level 3
                        (Minimum)        (Medium)         (High)

Frame Format            QCIF             CIF              CIF
(Pixels)                (176 x 144)      (352 x 288)      (352 x 288)

Frame Speed             5                Up to 15         Up to 30
(frames/sec)

Data Rate               56 / 64 Kbps     Up to 384 Kbps   Up to 1.544 Mbps

Motion Compensation     No               Limited          Full Motion
                                         (6 x 6 = 36)     (30 x 30 = 900)

Pre and post            Not Applicable   Not Applicable   Pre and post
processing on both                                        processing on both
encoder and decoder                                       encoder and decoder

Table 3-3: Levels of H.320 compliance [Nerino, 94]

2. H.320 and ISDN

ISDN is a connection-oriented, circuit-switched digital communication service that is
provided by telephone companies and network providers. It provides end-to-end digital
connectivity between local area networks (LANs). ISDN connects users to LANs and can
also connect LANs to wide area networks (WANs). The basic ISDN connection bandwidth
is 128 Kbps, split between two bearer (video, audio) channels at 64 Kbps each. There is an
additional 16 Kbps data channel that carries connectivity data.

Implementation of ISDN channels is fairly flexible. Telephone companies provide
services that allow ISDN channels to be split (e.g., a 64 Kbps channel split into two 32 Kbps
channels to provide low-fidelity digitized voice) or bonded together. Bonding creates one
logical channel out of multiple virtual channels. For example, the Navy’s Video Information
Exchange System (VIXS) uses bonding to provide bandwidth of 112-384 Kbps in order to
allow afloat and ashore nodes to conduct face-to-face meetings in real time.

ISDN offers improved videoconferencing connectivity over dedicated, point-to-point
systems, because it works over existing phone lines and does not require the installation of
an extensive network backbone. Unfortunately, several major reasons remain why ISDN is
not a good long-range alternative for distance learning. One is the lack of access for remote
users in a globally dispersed military environment. Another is that multicasting requires
deciding how the end points will be handled, i.e., adding multipoint control units (MCUs).
Finally, continuing to implement ISDN as the primary long-haul videoconferencing
architecture in the Navy is at odds with the Defense Information Infrastructure
Common Operating Environment (DII COE) migration towards the consolidation of voice,
video and data networks.
Recent versions of videoconferencing systems that use ISDN as their transport medium
have begun to migrate to the H.320 protocol, but many vendors still use proprietary
protocols in their videoconferencing systems. One possible reason is that, although the
H.320 standard is technically sound, ISDN has had a poor showing in the marketplace;
consequently, bundling H.320 with ISDN has inhibited initial acceptance of H.320.


E. 3RD GENERATION STANDARDS

1. Internet Videoconferencing

As the Internet and client-server computing continued to grow, videoconferencing
systems for LANs and WANs began to be developed. H.323 (an extension of H.320) covers
videoconferencing over narrow-band WANs and also over LANs. Since H.323 is based
upon the IETF’s Real-time Transport Protocol (RTP), which will be discussed in more detail
in Chapter IV, it can be applied to streaming video over packet-switched networks such as
the Internet. H.323 also applies to point-to-point and multipoint sessions. Some of the
other components of H.323 include:

•  Specifying  messages  for  call  control  including  signaling,  registration  and 
admissions,  and  packetization/  synchronization  of  media  streams. 

•  Specifying  messages  for  opening  and  closing  channels  for  media  streams,  and 
other  commands,  requests  and  indications. 

•  H.261  (video  codecs) 

• H.263 - Specifies a new video codec for video over POTS (< 64 Kbps).

• G.711, G.722, G.728 and G.729 standards


•  H.230  Frame  Synchronous  Control  Standards 

•  H.245  Link  Control  Standards 

• T.120 Data Sharing Standard

2. Plain Old Telephone Service (POTS)

POTS is the acronym for Plain Old Telephone Service. Videoconferencing over POTS
utilizes the existing infrastructure of telephone lines and was designed to address the need
for an inexpensive, high-quality solution for videoconferencing over that infrastructure.
The H.324 standard addresses high-quality video and audio compression over POTS modem
connections. Specifically, it specifies a common method for sharing video, data, and voice
simultaneously using high-speed (V.34) modem connections over a single POTS telephone
line.

Videoconferencing over POTS has been the least attractive of the medium options
because of its bandwidth constraints. However, because H.324 builds on the most common
global communications facility in use today, POTS has a broad impact on the current
marketplace. Even though the actual bandwidth of POTS hasn’t grown much, it is
becoming less of an obstacle, since today’s modem technology and data compression make
it technically feasible to transmit both very low frame rate video and voice over a single
line. As processors have become more capable, codec functions are now performed
primarily in software, often achieving full-color, 15 frames per second (under optimal
conditions), full-duplex video and audio, with real-time responsiveness. Some of the major
components of H.324 are:
• H.263 - Defines video coding at rates less than 64 Kbps.

• H.261 - Video compression from 64 Kbps to 2 Mbps

•  H.223  -  Defines  a  Multiplexing  protocol  for  low  bit  rate  multimedia  terminals. 

•  H.245  —  Defines  control  of  communications  between  multimedia  terminals. 

•  G.723  —  Defines  speech  coding  for  multimedia  telecommunications  transmitting 
at  5.3/6.3  Kbps. 


F. VIDEO COMPRESSION

In the past, due to the bandwidth constraints of terrestrial media, satellite was the
traditional and reliable method for transporting videoconferencing between users. Due to
technical improvements in routing and switching, however, high-quality
videoconferencing can also be realized with dedicated circuit-switched channels.
Unfortunately, due to the high cost and lack of widespread availability of these channels,
most desktop computer users do not have access to a dedicated videoconferencing link that
can transfer data at the necessary data rates. The chief digital transportation medium that
the average computer user has access to is the Internet, which is based upon a non-guaranteed
bandwidth, packet-switching technology, often connected to an end user via POTS. Even as
more capable routers, switches and modems are used to deliver videoconferencing,
providing coherent end-to-end video and audio streams across the Internet remains a major
obstacle, due to the lack of guaranteed bandwidth. Video and audio quality can be very poor
due to Internet congestion, routing delay, packet loss and retransmission, constant packet
rerouting, limited multicasting capabilities, and other factors.
One way to make better use of the available bandwidth is to compress the data prior to
its traversing a network. This can generally be accomplished using two types of data
compression schemes: lossless and “lossy.” Lossless compression schemes are generally used
in algorithms like those behind the zip, gzip and gif file types. When using these types of
algorithms, no data is lost during the compression and subsequent decompression of the
data. The lossy compression algorithms search for redundant data and replace it with
approximations. Fortunately, due to the inability of the human eye to discern small losses
of data in a digital image (notably the fact that small color details aren’t perceived as well
as small details of light and dark), lossy compression techniques are very suitable for
videoconferencing.
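
The distinction between the two families can be illustrated with a short sketch. It is not taken from any videoconferencing codec: the lossless half uses Python's standard `zlib` module (the DEFLATE method also used by gzip), and the lossy half is a deliberately crude stand-in that rounds each sample to a multiple of an assumed step size of 4.

```python
import zlib

# A short run of 8-bit sample values (illustrative data only).
samples = bytes([100, 101, 99, 100, 102, 101, 100, 98] * 32)

# Lossless: decompression returns exactly the bytes that were compressed.
packed = zlib.compress(samples)
restored = zlib.decompress(packed)


def coarse_quantize(data, step=4):
    """Lossy stand-in: round each sample to the nearest multiple of `step`."""
    return bytes(min(255, round(b / step) * step) for b in data)


# Lossy: fine detail is discarded, so the result only approximates the input.
approx = coarse_quantize(samples)
max_error = max(abs(a - b) for a, b in zip(samples, approx))
```

The lossless round trip reproduces the input exactly, while the lossy version differs from the input by a small, bounded error per sample.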

There are a number of compression techniques available for use in
videoconferencing, and H.261 is one of the most widely used in commercial
videoconferencing products. Motion JPEG, Indeo, MPEG-1, and MPEG-2 are also
prevalent. H.261 is optimized for bandwidth efficiency and low delay, whereas MPEG is
less bandwidth efficient. MPEG is editable and provides the high visual quality required by
movie-type applications. Indeo compression, offered by Intel, is optimized for low decode
processing requirements.

In order to provide an appreciation of video compression algorithms, an overview of
H.261 will be given. Audio compression will not be discussed in detail since it uses the
same basic principles used for video compression. A good reference for the details is
[Rettinger, 95].

H.261 is a widely used international video compression standard for
videoconferencing that is designed for applications which use synchronous circuit-switched
networks, e.g. ISDN, as their transmission channels. It was approved by the International
Telecommunication Union (ITU), formerly the CCITT, in 1990, and is currently used in
conjunction with H.320, H.323 and H.324. H.261 is an interoperability standard that
pertains to communication between the encoders/decoders (codecs) used by
videoconferencing systems. It is often called Px64, where P (1-30) is the multiple of
64 Kbps at which the video is sent. H.261 is similar to other “lossy” compression standards
like JPEG, MJPEG and MPEG. JPEG is a compression standard used for still pictures,
whereas MPEG and H.261 deal with motion video. Motion JPEG (MJPEG) generally uses
the same techniques as H.261, such as Discrete Cosine Transform (DCT) encoding,
quantization, macroblocks, etc. Using “lossy” compression algorithms, H.261 has provided
a major advantage in dealing with the bandwidth constraints of various transmission media,
without losing any significant picture quality (at least as far as the human eye is concerned).
Although both MPEG and H.261 handle motion pictures, MPEG is designed to handle
compressed bitstreams for the moving picture components of audio/visual services at rates
from 0.9 to 1.5 Mbps. H.261, designed to target videoconferencing applications where
motion is naturally limited, is specified from 64 Kbps to approximately 2 Mbps.

Because the algorithms used in codecs are computation-intensive, codecs in early
videoconferencing systems were implemented in a separate piece of hardware. With
today’s more powerful processors, however, the computations can be done by the
computer’s onboard processor.

H.261 uses the Discrete Cosine Transform (DCT) to take advantage of the intraframe
spatial and interframe temporal redundancy found in picture data. Spatial redundancy refers
to the similarities in information within the same picture frame. The encoder relies on a
small number of bits to describe areas (pixels) of a picture that are the same color, thereby
eliminating the need to code each pixel for every transmission of data across the channel.
Temporal redundancy, exploited using motion compensation, refers to the similarities of
information between adjacent frames in a group of moving pictures; only pixels that have
changed from one frame to the next are transmitted. In summary, the DCT gets rid of
redundant data bits in each block of picture frame data.

H.261 also takes advantage of limitations in the human eye. Even though NTSC’s
standard for transmitting moving pictures is 30 frames per second, the human eye can only
discern movement up to about 24 frames per second. In fact, even 15-25 frames per second
appears as smooth motion to the human eye.

Using  “lossy”  compression  algorithms  in  H.261  has  provided  a  major  advantage  in 
dealing  with  the  bandwidth  constraints  of  various  transmission  mediums,  without  losing  any 
significant  picture  quality. 
1. H.261 Structure


Figure  3-1  depicts  a  flow  diagram  of  a  typical  H.261  standards  based  system 
encoder. 


Figure  3-1  Encoder  Flow  Diagram  [Jin,  96] 

Except for the first frame, when a picture sequence is sent to the encoder, the
encoder determines whether the reference frame is going to be the present picture frame or
the previous frame. If the reference frame is the present frame, intra-frame coding (I-coding)
will be performed. When using I-coding, the data will go directly through a discrete cosine
transform (DCT), where it will be transformed from the spatial to the frequency domain.
The DCT coefficients are then sent to a quantizer, where each coefficient is expressed as a
level from a finite number of predetermined levels. After quantization, a decision is made to
determine if the current macroblock (8x8 pixel array) is valid. Bit-rate control is
performed, and eventually the bits are encoded and transmitted.

If the reference frame is going to be the previous frame, inter-frame coding (P-coding)
is performed. Here, a motion vector search is performed to determine the direction in which
to begin the search for the nearest, most similar macroblock between the current (target) and
previous (reference) frames. After a match is found, either no filtering (subtract the pixel
values of the matched macroblock in the previous frame from those in the current
macroblock) or filtering (subtract the pixel values of the filtered matched macroblock in the
previous frame from those in the current macroblock) is performed. A differential pulse
code modulator (DPCM) codes the difference between the successive values instead of
coding the actual values.
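
The DPCM idea of coding differences rather than actual values can be sketched as follows. This is a minimal illustration with an assumed starting predictor of zero and no quantization of the differences; it is not the DPCM stage of any particular H.261 implementation.

```python
def dpcm_encode(values):
    """Replace each value with its difference from the previous value."""
    previous = 0  # assumed starting predictor
    differences = []
    for value in values:
        differences.append(value - previous)
        previous = value
    return differences


def dpcm_decode(differences):
    """Rebuild the original values by accumulating the differences."""
    previous = 0
    values = []
    for diff in differences:
        previous += diff
        values.append(previous)
    return values


row = [120, 121, 121, 123, 122, 122]  # illustrative, slowly varying values
encoded = dpcm_encode(row)
```

After the first entry, the encoded values are small numbers clustered around zero, which later coding stages can represent with fewer bits than the original values.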


Figure  3-2  H.261  Data  Structure  [Jin,  96] 


As shown in Figure 3-2, an H.261 video sequence begins with the picture frames,
followed by Groups of Blocks (GOBs), Macroblocks, and Blocks. Each picture is divided
into twelve or three GOBs for the CIF or QCIF frame format, respectively. Thirty-three
macroblocks are organized in a fixed 11x3 format to form a GOB.

As shown in Figure 3-3, each macroblock consists of four 8x8 luminance
(brightness) and two 8x8 chrominance (color) blocks.


2.  Discrete  Cosine  Transform  (DCT) 

H.261 uses the Discrete Cosine Transform (DCT), a form of frequency
transformation which converts a signal from its spatial domain to its frequency domain in
order to take advantage of the spatial and temporal redundancy in the picture data. Spatial
redundancy refers to the similarities in information within the same picture frame. The
encoder relies on a small number of bits to describe areas (pixels) of a picture that are the
same color, thereby eliminating the need to code each pixel for every transmission of data
across the channel. Temporal redundancy refers to the similarities of information between
adjacent frames in a group of moving pictures; only pixels that have changed from
one frame to the next are transmitted. Since the eye is more receptive to luminance than it
is to chrominance, bit representations of luminance both contain more bits and are sampled
more frequently than the color components, which tend to be noisy.
In H.261, a two-dimensional DCT is performed on 8 x 8 pixel blocks (luminance and
chrominance). Unlike the Discrete Fourier Transform, all multiplications in the DCT use
only real values, thus lowering the number of required computations. The 8 x 8 array is
input to the DCT, and the output is an 8 x 8 array of integer DCT coefficients, with the
number of nonzero values significantly decreased. This reduction in nonzero values is only
the first part of the compression. For most images, much of the signal energy is in the lower
frequencies, which appear in the upper left corner of the DCT array. The lower right values
represent higher frequencies, and are often small enough to be neglected with little visible
distortion. Figure 3-4 is a mathematical model of the two-dimensional Discrete Cosine
Transform (DCT).


Figure  3-4  Two  dimensional  Discrete  Cosine  Transform  [Jin,  96] 
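
As an illustration of the transform described above, the following sketch evaluates the standard 8 x 8 two-dimensional DCT (type II) directly from its defining sum. It is a straightforward, unoptimized rendering of the textbook definition, not the formula reproduced from [Jin, 96], and real codecs use fast integer approximations instead.

```python
import math

N = 8  # block size used by H.261


def dct_2d(block):
    """Direct 8 x 8 two-dimensional DCT-II of a block given as a list of rows."""
    def scale(k):
        return 1 / math.sqrt(2) if k == 0 else 1.0

    coeffs = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            # The factor 1/4 is 2/N for N = 8.
            coeffs[u][v] = 0.25 * scale(u) * scale(v) * total
    return coeffs


# A perfectly flat block has all of its energy in the upper left (DC) coefficient.
flat = [[128] * N for _ in range(N)]
flat_coeffs = dct_2d(flat)
```

For the flat test block every coefficient except the upper left one is zero to within floating-point rounding, which is the extreme case of the energy compaction described in the text.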


3.  Quantization 

The degree of quantization determines the image quality. A large quantization step
size can produce unacceptably large image distortion. Conversely, too fine a step size can
lead to lower compression ratios. The key challenge is to quantize the DCT coefficients as
efficiently as possible. H.261 does this by taking advantage of the limitations in the human
eye’s ability to discern high frequencies. The quantization matrix is an 8 x 8 matrix of step
sizes (quantums), which provides an element for each DCT coefficient. Step sizes are small
in the upper left (low frequencies) of the DCT array and large in the lower right (high
frequencies). The quantizer divides the DCT coefficients by their corresponding quantum
and then rounds to the nearest integer. Large quantums drive the small coefficients down to
zero, with the result that many high-frequency components are easier to encode. The
low-frequency components undergo only minor adjustments. Eventually, only the nonzero
DCT coefficients that survive the quantization stage are encoded and transmitted. This
quantizing is somewhat analogous to Mu-law and A-law non-uniform quantization, where
the voice signals at the lower amplitudes (which we are more likely to encounter) are
conditioned to provide more information at a slight cost to information at higher amplitudes.
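
The divide-and-round step can be sketched as follows. Both the step-size matrix and the sample coefficients are hypothetical values chosen only to show the effect described above (small steps at low frequencies, large steps at high frequencies); they are not taken from the H.261 recommendation.

```python
N = 8

# Hypothetical step sizes: small in the upper left, growing toward the lower right.
steps = [[8 + 4 * (u + v) for v in range(N)] for u in range(N)]

# Hypothetical DCT coefficients whose magnitude falls off with frequency.
coeffs = [[1024 / (1 + u + v) ** 2 for v in range(N)] for u in range(N)]


def quantize(coefficients, step_sizes):
    """Divide each coefficient by its step size and round to the nearest integer."""
    return [[round(c / q) for c, q in zip(c_row, q_row)]
            for c_row, q_row in zip(coefficients, step_sizes)]


def dequantize(levels, step_sizes):
    """Approximate reconstruction performed by the decoder."""
    return [[level * q for level, q in zip(l_row, q_row)]
            for l_row, q_row in zip(levels, step_sizes)]


levels = quantize(coeffs, steps)
zero_count = sum(level == 0 for row in levels for level in row)
```

With these illustrative values, more than half of the 64 quantized levels are zero, and the surviving nonzero levels are concentrated in the upper left of the array.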


4.  Motion  Compensation  and  Estimation 

When the motion of the source is generally limited, it is very likely that the
luminance and chrominance blocks are not that much different between successive picture
frames. In H.261, motion prediction is done on the luminance channel on blocks of 16 x 16
pixels. Two techniques exploit these similarities: motion prediction and motion
compensation. Motion prediction is performed at the encoder to determine what the motion
vector should be, whereas motion compensation consists of moving blocks of data around,
based upon that motion vector. As shown in Figure 3-5, the encoder vectors the reference
block and compares its bit structure with the bit structure in the target block, looking for
the closest match. Consequently, only the differences in the pixel values between the
current macroblock and its matched macroblock are encoded.
Motion compensation is effective because it moves only the section of video where
motion has occurred, rather than the entire video area for every frame. Essentially, each
frame can be reasonably coded by detecting the changes (which are usually very small)
from the previous one. This function is a very important aspect of lowering the bit rate.
Before a reference frame can be established, intra-frame coding must be done.
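
The search for the closest matching macroblock can be sketched as an exhaustive block-matching search. The matching criterion (sum of absolute differences), the search range of two pixels, and the synthetic test frames are all assumptions made for this illustration; the text does not specify them.

```python
def sad(current, reference, bx, by, dx, dy, size):
    """Sum of absolute differences between a current block and a displaced reference block."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(current[by + y][bx + x] - reference[by + y + dy][bx + x + dx])
    return total


def estimate_motion(current, reference, bx, by, size=16, search=2):
    """Exhaustively search +/- `search` pixels; return (cost, dx, dy) of the best match."""
    height, width = len(reference), len(reference[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            inside = (0 <= by + dy and by + dy + size <= height
                      and 0 <= bx + dx and bx + dx + size <= width)
            if not inside:
                continue
            cost = sad(current, reference, bx, by, dx, dy, size)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best


# Synthetic 32 x 32 luminance frames: the current frame is the reference shifted
# right by one pixel, so the best match lies one pixel to the left in the reference.
reference = [[(7 * x + 13 * y) % 251 for x in range(32)] for y in range(32)]
current = [[reference[y][max(x - 1, 0)] for x in range(32)] for y in range(32)]

match = estimate_motion(current, reference, bx=8, by=8)
```

Because the match is exact in this synthetic case, the residual to be encoded for the macroblock is zero and only the motion vector needs to be sent.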


Figure  3-5  P-coding  (interframe)  [Jin,  96] 


Figure  3-6  shows  how  each  macroblock  is  intra-frame  encoded.  The  intra-frame  is 
used  as  an  accessing  point.  Figure  3-7  shows  the  frame  sequencing  used  in  H.261. 
Figure 3-7 H.261 frame sequence encoding [Jin, 96]


After quantization, it is not unusual for more than half of all of the DCT coefficients
to be equal to zero. One coding scheme, run-length coding, is used to take advantage of
this. In run-length coding, except for the DC coefficients of the intra-coded blocks, all DCT
coefficients are encoded using the run-length algorithm in a zig-zag fashion, as shown in
Figure 3-8. For each nonzero value, the number of zeros that preceded the number and the
amplitude of the number itself form a pair. If the last nonzero value does not happen to be
the last coefficient in the block, an End-of-Block code is attached to tell the decoder that
there are no more nonzero coefficients left in the 8 x 8 block.
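
The zig-zag scan and the (run, amplitude) pairing described above can be sketched as follows. The function names and the use of the string "EOB" as the End-of-Block marker are illustrative choices, and the DC coefficient is simply returned separately rather than coded.

```python
def zigzag_order(n=8):
    """Positions of an n x n block in zig-zag order, starting at the upper left."""
    # Walk the anti-diagonals; odd diagonals run downward, even diagonals run upward.
    return sorted(((r, c) for r in range(n) for c in range(n)),
                  key=lambda p: (p[0] + p[1], p[0] if (p[0] + p[1]) % 2 else p[1]))


def run_length(block):
    """Return the DC value and the (zero run, amplitude) pairs for the AC coefficients."""
    scan = [block[r][c] for r, c in zigzag_order(len(block))]
    pairs, run = [], 0
    for value in scan[1:]:  # the DC coefficient is handled separately
        if value == 0:
            run += 1
        else:
            pairs.append((run, value))
            run = 0
    if run:  # trailing zeros: tell the decoder that nothing else follows
        pairs.append("EOB")
    return scan[0], pairs


# Illustrative quantized block with only three nonzero coefficients.
block = [[0] * 8 for _ in range(8)]
block[0][0], block[0][1], block[1][1] = 50, 3, -2
dc, pairs = run_length(block)
```

In this example 63 AC coefficients collapse to two pairs plus the End-of-Block marker, which is why the scan is ordered so that the likely-zero high frequencies come last.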


The coded pairs then go through variable length encoding, where each pair has
its own code word, assigned through a variable length code. The basic idea is to assign
shorter code words to represent more frequently occurring values and longer code words to
the less frequent values, in order to compress data even further. Huffman coding is the most
common. Many Huffman tables used for different types of data are specified in the H.261
standard.
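
The principle of assigning shorter code words to more frequent values can be sketched with a generic Huffman construction. Note the assumption: this sketch builds a table from observed symbol frequencies purely to illustrate the idea, whereas the text states that H.261 specifies its tables in the standard itself.

```python
import heapq
from collections import Counter


def huffman_code(symbols):
    """Build a prefix code in which frequent symbols receive shorter code words."""
    freq = Counter(symbols)
    # Each heap entry: (weight, tie-breaker, {symbol: code word so far}).
    heap = [(count, i, {sym: ""}) for i, (sym, count) in enumerate(freq.items())]
    heapq.heapify(heap)
    if len(heap) == 1:  # degenerate case: a single distinct symbol
        return {sym: "0" for sym in heap[0][2]}
    tie = len(heap)
    while len(heap) > 1:
        w1, _, codes1 = heapq.heappop(heap)
        w2, _, codes2 = heapq.heappop(heap)
        merged = {sym: "0" + code for sym, code in codes1.items()}
        merged.update({sym: "1" + code for sym, code in codes2.items()})
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]


# Illustrative symbol stream: "a" is the most frequent, "d" the least.
table = huffman_code("aaaaaaaabbbbccd")
```

With these frequencies the most common symbol receives a one-bit code word and the two rarest symbols receive three-bit code words.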

H.261 is only the baseline video compression standard for videoconferencing. There
are many faster and more efficient codecs, which are H.261 compliant, that use their own
proprietary algorithms. Nevertheless, even a minimum H.261 compliant codec can provide
tremendous compression ratios (well beyond 100:1). Figure 3-9 shows the bit rates required
to transmit CIF and QCIF video at various frame rates with 100:1 data compression using
the H.261 compression standard.
Figure 3-9 Frame Rate vs. Bit Rate for compressed data
(Chart title: Bit Rates Required to Transmit Compressed Video in CIF and QCIF Format;
axis label: Frames per second)
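
The arithmetic behind a chart of this kind can be sketched as follows. The figure of 12 bits per pixel is an assumption (8-bit samples with four luminance and two chrominance blocks per macroblock, as described earlier), and the resulting numbers are illustrative rather than a reproduction of the values plotted in Figure 3-9.

```python
BITS_PER_PIXEL = 12  # assumption: 8-bit samples, 4 luminance + 2 chrominance blocks per macroblock

FORMATS = {"CIF": (352, 288), "QCIF": (176, 144)}


def required_kbps(picture_format, fps, compression_ratio=100):
    """Bit rate in Kbps needed after compressing raw video by `compression_ratio`."""
    width, height = FORMATS[picture_format]
    raw_bps = width * height * BITS_PER_PIXEL * fps
    return raw_bps / compression_ratio / 1000


cif_30 = required_kbps("CIF", 30)    # about 365 Kbps under these assumptions
qcif_15 = required_kbps("QCIF", 15)  # about 46 Kbps under these assumptions
```

Under these assumptions, 30 fps CIF video at 100:1 compression fits within a 384 Kbps channel, and 15 fps QCIF video fits within a single 64 Kbps channel.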


5.  MPEG 

Many of the compression techniques used in the H.261 standard are similar to those
used in MPEG-1, but there are three major differences: data structure, coding type, and
frame ordering [Jin, 96]. Because MPEG is targeted at more bandwidth-intensive
applications than H.261, this thesis will not provide an in-depth description of the MPEG
standards.


G. AUDIO COMPRESSION

Audio compression standards are the most important function of videoconferencing
systems, across all generations. Currently, Mu-law and A-law are the most common
compression techniques used to condense audio data in videoconferencing systems.
Both are non-uniform pulse code modulation (PCM) encoding techniques that use the
quantized values of the samples in order to present a discrete representation of the audio
signal. Each sample is represented by a code word that is 8 bits in length. Mu-law and A-law
transformations allow 8 bits per sample to represent the same range of values that would be
achieved with 14 bits per sample using uniform PCM, which translates to a compression
ratio of 1.75:1. Due to the logarithmic nature of the transformation, the low-amplitude
samples are encoded with greater accuracy than the higher-amplitude samples.
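
The logarithmic behavior described above can be sketched with the continuous Mu-law companding curve. This is an illustration under stated assumptions: it uses the textbook formula with mu = 255 and a simple signed code in the range -127 to 127, whereas G.711 itself defines a segmented approximation of this curve with its own bit layout.

```python
import math

MU = 255  # companding constant of the continuous Mu-law curve


def compress(x):
    """Map a sample in [-1, 1] onto the logarithmic Mu-law scale, also in [-1, 1]."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)


def expand(y):
    """Inverse of compress()."""
    return math.copysign(((1 + MU) ** abs(y) - 1) / MU, y)


def encode(x):
    """Quantize a sample to a signed code in -127..127 (fits in 8 bits)."""
    return round(compress(x) * 127)


def decode(code):
    """Reconstruct the approximate sample value from its code."""
    return expand(code / 127)


# Spacing between adjacent reconstruction levels at low and at high amplitude.
step_low = decode(1) - decode(0)
step_high = decode(127) - decode(126)
compression_ratio = 14 / 8  # 14-bit uniform PCM range carried in 8 bits
```

The reconstruction levels are packed far more densely near zero than near full scale, which is the sense in which low-amplitude samples are encoded with greater accuracy.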

Major  techniques  that  are  designed  for  audio  signals: 

• G.711 - 48 - 64 Kbps narrow-band

• G.722 - 48 - 64 Kbps wide-band

• G.723 - Speech coding at 5.3/6.3 Kbps

• G.728 - 16 Kbps narrow-band

ITU-T Recommendation G.711, “Pulse code modulation of voice frequencies,”
provides telephone-quality audio (narrow-band, 3 kHz).

G.722 provides stereo-quality audio (wide-band, 7 kHz). It is typically used at higher
data rates, usually > 256 Kbps, where it provides the best audio quality available
[VTEL, 95]. G.722 uses adaptive differential pulse code modulation (ADPCM), which uses
predictive algorithms to predict the values of adjacent samples. It takes the difference
between the predicted and actual sample and encodes that difference. The coding is adaptive
because the encoders can also adapt to changing quantizing or prediction parameters.
ADPCM generally achieves ratios of 2:1, as compared with 1.75:1 for Mu-law or A-law.
G.722 has three modes of operation: 64, 56, and 48 Kbps. If a 64 Kbps communication
channel is used, the 56 or 48 Kbps modes leave an additional 8 or 16 Kbps of bandwidth,
respectively, for other data.
For audio over narrow-band POTS lines, there is G.723, which supports a
compressed 3.4 kHz signal. It defines speech coding for audio transmitted at 5.3/6.3 Kbps.

G.728  provides  narrow-band  audio,  which  is  important  for  lower  bit  rates  <  256 
Kbps.  It  is  designed  specifically  for  speech  signals.  G.728  uses  another  type  of  predictive 
coding  called  code  excited  linear  prediction  (CELP),  which  requires  a  bandwidth  of  16  Kbps 
and  is  very  computationally  complex,  requiring  special  hardware. 

As described in H.320, if two systems in a call support different classes of audio
compression, the less capable of the two will be used. For example, if a Class 3 system
(G.728) establishes a call with a Class 1 system, the audio will be G.711. [VTEL, 95]


H.  DATA  STANDARDS 

The  T.120  standard  focuses  on  collaborative  computing,  common  whiteboard,  and 
applications  sharing  during  any  H.32x  videoconference.  It  defines  the  communication  and 
application  protocols  and  services  that  support  real-time  multipoint  data  communications. 
The  specification  also  allows  data-only  T.120  sessions,  when  no  video  communications  are 
required.  In  addition,  T.120  supports  multipoint  meetings  with  participants  using  different 
transmission  media.  T.120  recommendations  include: 

• T.122 Multipoint Communication Service

• T.123 Network Specific Transport Protocols

• T.124 Generic Conference Control

• T.126 Still Image Exchange
I. SUMMARY


As  network  architectures  have  evolved,  newer  standards  are  continually 
implemented.  But  in  order  to  provide  cross-platform  capability,  flexibility,  scalability,  and 
accommodation  of  newer  technologies  as  they  emerge,  the  protocols  and  standards  used  in 
videoconferencing  for  distance  learning  must  be  compatible  with  the  standards  from  the 
International  Standards  bodies.  These  standards  should  be  the  baseline  used  in 
videoconferencing  systems  for  distance  learning. 

Commonly available software codecs not only reduce the bandwidth demands placed
on already strained data pipes, but also allow more data to be stored in a PC’s storage
device(s). This makes it possible for more course material to be streamed on demand,
providing the asynchronous capability necessary for delivering distance learning to sea.

Although software video codecs lack the compression speed of dedicated codecs,
they have the advantage of low cost. Furthermore, more powerful processors like Intel’s
Pentium II with MMX technology improve video compression/decompression.
IV. IP MULTICASTING AND THE MBone


A.  INTRODUCTION 

This  chapter  focuses  upon  multicasting  videoconferencing  sessions  over  IP-based 
networks. It must be noted, however, that IP is very flexible: it can be used over a variety
of network segments, including ATM, frame relay, switched multimegabit data service
(SMDS), satellite, dial-up asynchronous, and ISDN. This chapter also discusses the major
protocols supported by the IP Multicast Initiative (IPMI). Founded in 1996, the IPMI is a
multi-vendor cooperative effort to promote the deployment of industry-standard IP
Multicast technologies, many of which are specified in IETF Requests for Comments (RFCs).
Its members include leaders in the high technology industry such as IBM, Intel, Microsoft,
Cisco Systems, Silicon Graphics, and GTE.


B.  BACKGROUND 

As shown in Chapter II, videoconferencing compression algorithms help reduce
network bandwidth requirements, allowing videoconference applications to deliver
real-time, quality video and audio data across networks. But compression solves only one
part of the bandwidth problem. For example, what if a videoconferencing application needed
to send data to multiple hosts simultaneously? One way to accomplish that task would be to
transmit identical IP packets to each recipient. If there are many recipients, this could
potentially strain the network. To avoid this problem, the Internet Engineering Task Force
(IETF), an arm of the Internet Architecture Board (IAB) that approves Internet standards,
endorsed IP multicast as a standards-based solution. Two factors make multicasting
practical on the Internet: the lack of unlimited bandwidth on the Internet backbone
connections, and the widespread availability of workstations across a wide global network
infrastructure [Macedonia, Brutzman 94].


C.  IP  MULTICASTING

RFC 1112, “Host Extensions for IP Multicasting,” authored by Steve Deering in
1989, defines IP multicasting as an extension of IP Version 4. IP multicasting is described
as “the transmission of an IP datagram to a ‘host group’, i.e. a set of zero or more hosts
identified by a single IP destination address” [Johnson, 97]. IP multicast allows
applications to send data over the Internet to many simultaneous recipients in a more
economical fashion than unicast or broadcast IP transmissions. Unicast IP is from a single
source to a single destination (one-to-one), so in order to send information to multiple
recipients using unicast, an application needs to send multiple copies of each IP datagram,
which might saturate the transmission medium. Broadcast IP sends data to all of the
participants in a network whether they want it or not.
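
The difference in load at the source can be illustrated with simple arithmetic. The sketch below is only an illustration: the 128 kbit/s stream rate and the 50 receivers are assumed figures chosen for the example, not measurements from this thesis, and the function name is hypothetical.

```python
def sender_load_kbps(stream_kbps: float, receivers: int, multicast: bool) -> float:
    """Bandwidth the source must place on its first network link.

    Unicast: one copy of the stream per receiver.
    Multicast: a single copy addressed to the group, regardless of group size.
    """
    if receivers < 0:
        raise ValueError("receivers must be non-negative")
    if multicast:
        return stream_kbps
    return stream_kbps * receivers


if __name__ == "__main__":
    # Assumed example values: a 128 kbit/s stream delivered to 50 students.
    print("unicast  :", sender_load_kbps(128, 50, multicast=False), "kbit/s")
    print("multicast:", sender_load_kbps(128, 50, multicast=True), "kbit/s")
```

Under these assumed figures, the unicast source would have to sustain fifty times the stream rate, while the multicast source sends the stream once.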

When the Internet Protocol (IP) was developed, Class D IP addressing was designed to
facilitate multicasting. Unlike unicast IP addresses, which identify specific destinations,
Class D addresses identify a particular transmission session. Class D addresses are reserved
for groups rather than individual hosts. The addresses range from 224.0.0.0 to
239.255.255.255.
There are also certain special addresses (listed in RFC 1700, “Assigned Numbers”):

• 224.0.0.1, the “all hosts group,” addresses all multicast hosts on a directly
connected net.

• 224.0.0.2 addresses all routers in a LAN.

• 224.0.0.0 through 224.0.0.255 is reserved for routing protocols and other low-level
topology discovery or maintenance protocols.

• 224.0.1.3 through 224.0.13.255 is reserved for Network News.
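
A minimal sketch of how an application might test whether an address falls in the Class D range is shown below. It uses Python's standard `ipaddress` module; the constant and function names are hypothetical and chosen only for this illustration. The range 224.0.0.0 to 239.255.255.255 corresponds to the prefix 224.0.0.0/4.

```python
import ipaddress

# Class D (multicast) space: 224.0.0.0 through 239.255.255.255.
CLASS_D = ipaddress.ip_network("224.0.0.0/4")

# Well-known group addresses named in the text.
ALL_HOSTS = ipaddress.ip_address("224.0.0.1")
ALL_ROUTERS = ipaddress.ip_address("224.0.0.2")


def is_class_d(addr: str) -> bool:
    """Return True if addr is an IPv4 Class D (multicast group) address."""
    return ipaddress.ip_address(addr) in CLASS_D


if __name__ == "__main__":
    for candidate in ("224.0.0.1", "239.255.255.255", "192.168.1.10"):
        print(candidate, is_class_d(candidate))
```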

With  IP  multicast,  the  source  application  is  not  necessarily  aware  of  the 
destinations.  Multicast  applications  send  one  copy  of  an  IP  packet  over  the  network  to  a 
group  address.  A  group  of  receivers  may  then  participate  by  joining  the  particular  multicast 
session  group.  The  multicast  IP  datagram  is  delivered  to  all  members  of  its  destination 
host  group  (group  Class  D  address)  with  the  same  ‘best  effort’  reliability  as  regular  unicast 
IP  datagrams  [Johnson,  97]. 
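
The sending side described above can be sketched with the standard Berkeley sockets interface. This is an illustrative sketch rather than the implementation used by any particular videoconferencing tool; the group address 239.1.2.3 and port 5004 are assumed values for the example, and the function names are hypothetical.

```python
import socket


def make_multicast_sender(ttl: int = 1) -> socket.socket:
    """Create a UDP socket whose multicast datagrams carry the given TTL.

    A TTL of 1 keeps the traffic on the directly attached network.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, ttl)
    return sock


def send_to_group(sock: socket.socket, payload: bytes, group: str, port: int) -> int:
    """Send ONE copy of payload to the group address; the source does not
    need to know who (if anyone) has joined the group."""
    return sock.sendto(payload, (group, port))


if __name__ == "__main__":
    sender = make_multicast_sender(ttl=1)
    try:
        # Assumed example group and port.
        send_to_group(sender, b"frame-0001", "239.1.2.3", 5004)
    except OSError as exc:
        print("send failed (no multicast route on this machine?):", exc)
    finally:
        sender.close()
```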

Some  of  the  rudimentary  requirements  of  IP  multicast  are: 

• Since hosts may leave or join a group at any time, membership in a host group of
an IP multicast session must be dynamic.

•  There  should  be  no  restrictions  on  the  location  and  number  of  groups  that  can 
participate. 

•  At  the  application  level,  a  host  may  have  multiple  data  streams  on  different  port 
numbers,  on  different  sockets,  in  one  or  more  applications  [Johnson,  97]. 

The  minimal  hardware/software  requirements  needed  to  deliver  IP  multicasts  end-to- 
end  are: 
• Support for IP multicast transmission and reception in the host’s TCP/IP protocol
stack and operating system.¹

• Software supporting Internet Group Management Protocol (IGMP), in order to
communicate requests to join multicast group(s) and receive multicast traffic.

•  Network  interface  cards  that  efficiently  filter  for  LAN  data  link  layer  addresses 
that  are  mapped  from  network  layer  IP  multicast  addresses. 

•  IP  multicast  application  software  such  as  videoconferencing  or  file  transfer.  The 
end-node  applications  should  be  flexible  in  terms  of  their  support  for  existing 
compression  technologies  and  accommodation  of  newer  technologies  as  they 
emerge. 

• Intermediate routers between the sender(s) and receiver(s) must be IP
multicast-capable.²

•  Firewalls  (i.e.  packet-filtering  software)  may  need  to  be  reconfigured  to  permit 
IP  multicast  traffic.  [Johnson,  97] 
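
The mapping from network layer multicast addresses to data link layer addresses mentioned in the list above can be sketched as follows. The sketch assumes the Ethernet mapping given in RFC 1112, in which the low-order 23 bits of the group address are placed into the fixed prefix 01:00:5e; the function name is hypothetical.

```python
import ipaddress


def multicast_mac(group: str) -> str:
    """Map an IPv4 multicast group address to its Ethernet multicast address.

    Assumes the RFC 1112 rule: prefix 01:00:5e followed by the low-order
    23 bits of the IP group address.
    """
    addr = ipaddress.IPv4Address(group)
    if not addr.is_multicast:
        raise ValueError(f"{group} is not a Class D address")
    low23 = int(addr) & 0x7FFFFF
    mac = (0x01005E << 24) | low23
    return ":".join(f"{(mac >> shift) & 0xFF:02x}" for shift in range(40, -8, -8))


if __name__ == "__main__":
    for g in ("224.0.0.1", "224.1.1.1", "239.129.1.1"):
        print(g, "->", multicast_mac(g))
```

Because only 23 of the 28 variable bits of a Class D address are carried over, several group addresses share one Ethernet address (224.1.1.1 and 239.129.1.1 in the example), so the interface card's filter is a first pass and the IP layer must still check the group address.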

Figure  4-1  is  an  overview  of  the  requirements. 




Figure  4-1  Requirements  for  IP  Multicasting 


¹ Windows NT, Windows 95, and the latest versions of UNIX support IP multicast.

² Multicasting capability can be enabled in most routers by simply updating the software and adding memory.
When a host application requests membership in the host group associated with a
particular multicast session, the request is communicated to the subnet’s multicast router
and, if necessary, on to intermediate routers between the sender and receiver. When the
requested session is found, the router delivers the incoming multicast IP datagrams to the
requesting host, where they are passed to the TCP/IP stack, which makes the data
available as input to the user’s application. Other stations filter out the multicast
packets at the hardware level.
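
From the application's point of view, the membership request described above is typically a single socket option; the host's TCP/IP stack then generates the IGMP traffic. The following is a sketch under that assumption, using the standard `IP_ADD_MEMBERSHIP` option. The helper names, group address, and port are hypothetical.

```python
import socket
import struct


def build_mreq(group: str, interface: str = "0.0.0.0") -> bytes:
    """Build the 8-byte membership request: group address followed by the
    local interface address (0.0.0.0 lets the stack choose the interface)."""
    return struct.pack("4s4s", socket.inet_aton(group), socket.inet_aton(interface))


def join_group(group: str, port: int) -> socket.socket:
    """Open a UDP socket and ask the stack to join the given host group."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, build_mreq(group))
    return sock


if __name__ == "__main__":
    try:
        receiver = join_group("239.1.2.3", 5004)  # assumed example session
        print("joined; waiting for one datagram")
        receiver.settimeout(2.0)
        print(receiver.recvfrom(2048))
    except OSError as exc:  # includes timeout and missing multicast support
        print("no data received:", exc)
```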

Multicast routers do not need to know the list of member hosts for each group. A
router needs to know only which groups have at least one member on the subnet. A multicast
router attached to an Ethernet need associate only a single Ethernet multicast address with
each host group having a local member.


1.  IP  Multicast  Protocols

Like  any  other  means  of  transporting  data  over  network  infrastructures,  IP  multicast 
comes  with  an  array  of  protocols  that  help  provide  the  framework  for  multicasting  IP 
datagrams.  The  most  fundamental  of  IP  multicast  protocols,  Internet  Group  Management 
Protocol  (IGMP  Ver.  2),  described  in  RFC  2236,  is  used  by  multicast  routers  in  order  to 
learn  the  existence  of  host  group  memberships.  It  is  the  baseline  protocol  necessary  to 
conduct  an  IP  multicast  session. 

The protocols used to ensure that the needed bandwidth and QoS are available
include Real-Time Transport Protocol (RTP), Real-Time Control Protocol (RTCP),
Real-Time Streaming Protocol (RTSP), and Resource Reservation Protocol (RSVP). There are
also associated routing protocols such as Protocol Independent Multicast (PIM), Multicast
Open Shortest Path First (MOSPF), and Distance Vector Multicast Routing Protocol
(DVMRP).

There are also transport issues that need to be addressed with IP multicast.
Applications that are IP multicast capable are not designed for use with reliable,
connection-oriented transports (TCP), because a TCP connection is established between two
individual hosts, whereas a multicast datagram is addressed to a group rather than to an
individual destination. They also do not require guaranteed in-sequence delivery of IP
packets. Furthermore, since IP datagrams do not follow a fixed path, there is no assurance
that the bandwidth needed for video and audio will be available. Videoconferencing
applications are better off tolerating missing data than enduring the lengthy delays caused
by TCP retransmissions. Therefore, a simpler transport framework such as User Datagram
Protocol (UDP), a transport layer protocol that provides only error detection, does a more
than adequate job of transporting videoconferencing data.

a.  Internet  Group  Management  Protocol  (IGMP) 

Internet  Group  Management  Protocol  (IGMP)  performs  two  main  functions. 
It  is  used  by  hosts  to  join  IP  multicast  sessions,  and  by  multicast  routers  to  learn  the 
existence  of  host  group  members  on  their  directly  attached  subnets,  identify  designated 
multicast  routers  in  a  LAN,  and  propagate  group  information  over  the  Internet.  It  is  loosely 
analogous  to  Internet  Control  Message  Protocol  (ICMP),  which  is  used  in  PING 
applications.  [Johnson,  97] 

Each multicast router sends IGMP queries (Host Membership Query), and
the hosts respond by reporting their host group memberships (Host Membership Report).
This query and response exchange is accomplished by IGMP messages encapsulated in IP
datagrams. To determine whether any hosts on a local subnet belong to a multicast group,
one multicast router per subnet periodically sends a hardware (data link layer) multicast
IGMP Host Membership Query (network address 224.0.0.1) to all IP end nodes on its
subnet. This message asks them to report back on the host group memberships of their
processes. These query messages have a time to live (TTL) of 1 to limit their transmission to
the network directly attached to the router. [Petitt, 96]
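
To make the message format concrete, the sketch below builds the 8-byte IGMP Version 2 message. It assumes the layout and type codes given in RFC 2236 (type, maximum response time in tenths of a second, 16-bit Internet checksum, group address); the function names are hypothetical. Actually transmitting such a message requires a raw socket and administrative privileges, so the sketch only constructs and verifies the bytes.

```python
import socket
import struct

# Message type codes, assumed from RFC 2236.
IGMP_MEMBERSHIP_QUERY = 0x11
IGMP_V2_MEMBERSHIP_REPORT = 0x16
IGMP_LEAVE_GROUP = 0x17


def internet_checksum(data: bytes) -> int:
    """16-bit one's complement of the one's complement sum of the data."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_igmp(msg_type: int, group: str = "0.0.0.0", max_resp_tenths: int = 0) -> bytes:
    """Build an IGMPv2 message. A group of 0.0.0.0 denotes a general query,
    which a router sends to 224.0.0.1 with an IP TTL of 1."""
    group_bytes = socket.inet_aton(group)
    unchecked = struct.pack("!BBH4s", msg_type, max_resp_tenths, 0, group_bytes)
    checksum = internet_checksum(unchecked)
    return struct.pack("!BBH4s", msg_type, max_resp_tenths, checksum, group_bytes)


if __name__ == "__main__":
    query = build_igmp(IGMP_MEMBERSHIP_QUERY, max_resp_tenths=100)
    report = build_igmp(IGMP_V2_MEMBERSHIP_REPORT, group="239.1.2.3")
    print(query.hex(), report.hex())
```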

Each host then schedules one IGMP Host Membership Report for each group to which it
belongs, sent after a random delay to the group address so that all group members see it.
When hosts see a Host Membership Report for a group to which they belong, they cancel
their own pending report for that group. Hence, only one member of the group will report
membership to the router for a particular group address. Periodically, local multicast
routers will send IGMP Host Membership Queries to the “all hosts” group to verify current
memberships. Although IGMP packets are routinely transmitted, their bandwidth use is
insignificant compared to the multicast application’s traffic. Figure 4-2 shows an
IGMP request on a LAN.


Figure  4-2  IGMP  Messages  on  a  LAN  [Johnson,  97]
When the last station on a subnet leaves a multicast group, the router
“prunes” the multicast data stream associated with it by ceasing to forward the data stream
to that subnet.
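
The query and report exchange on one subnet can be illustrated with a small simulation. It assumes the random response timer used by IGMP hosts (the host whose timer expires first reports, and the others suppress their reports); the function name, host names, and timer range are hypothetical.

```python
import random


def simulate_reports(hosts_by_group, max_delay=10.0, seed=None):
    """Simulate one query cycle on a subnet.

    hosts_by_group maps a group address to the local hosts that are members.
    Each member starts a random timer; the host whose timer expires first
    sends the report and all other members cancel theirs. Groups with no
    local members produce no report, so the router stops forwarding (prunes)
    their traffic onto this subnet.
    """
    rng = random.Random(seed)
    reporters = {}
    for group, hosts in hosts_by_group.items():
        if not hosts:
            continue  # nobody answers: this group is pruned
        timers = {host: rng.uniform(0.0, max_delay) for host in hosts}
        reporters[group] = min(timers, key=timers.get)
    return reporters


if __name__ == "__main__":
    subnet = {
        "239.1.1.1": ["pc-a", "pc-b", "pc-c"],
        "239.2.2.2": ["pc-d"],
        "239.3.3.3": [],
    }
    for group, host in simulate_reports(subnet, seed=1).items():
        print(f"{group}: single report sent by {host}")
```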

b.  Real-Time  Transport  Protocol  Version  2  (RTP)

Real-Time Transport Protocol (RTP), defined in RFCs 1889 and 1890,
provides end-to-end delivery services to support applications transmitting real-time
data [Johnson, 97]. Among the services that RTP provides are payload type identification,
packet sequence numbering, and time stamping. The delivery of RTP packets is monitored
by Real-Time Control Protocol (RTCP), which is discussed later.

RTP does not provide all of the functionality of a typical transport
protocol. It is a header format that runs in combination with other transport protocols in
order to take advantage of their functionalities. The RTP header provides timing
information  to  synchronize  and  display  audio  and  video  data,  and  also  to  determine  if 
packets  are  lost  or  arrive  out  of  order.  In  order  to  allow  multiple  data  and  compression 
types,  the  header  specifies  the  payload  type  by  characterizing  what  type  of  audio  and  video 
encoding  is  carried  in  the  RTP  packet.  This  enables  users  to  have  the  option  to  change  the 
encoding  methods  during  a  conferencing  session,  in  response  to  network  congestion,  or  to 
accommodate  low-bandwidth  requirements  of  a  new  conference  participant  [Johnson,  97]. 
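
The header fields discussed above can be sketched as follows. The layout assumes the 12-byte fixed header of RFC 1889 (version 2, no padding, no extension, no contributing sources); payload type 31 is assumed to denote H.261 video under the RFC 1890 profile. The function names and sample values are hypothetical, and sequence number wraparound is deliberately ignored to keep the example short.

```python
import struct

RTP_VERSION = 2


def pack_rtp_header(payload_type, seq, timestamp, ssrc, marker=False):
    """Pack the 12-byte fixed RTP header (V=2, P=0, X=0, CC=0)."""
    byte0 = RTP_VERSION << 6
    byte1 = (int(marker) << 7) | (payload_type & 0x7F)
    return struct.pack("!BBHII", byte0, byte1, seq & 0xFFFF,
                       timestamp & 0xFFFFFFFF, ssrc & 0xFFFFFFFF)


def parse_rtp_header(packet):
    """Unpack the fixed header from the first 12 bytes of a packet."""
    byte0, byte1, seq, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    return {
        "version": byte0 >> 6,
        "marker": bool(byte1 >> 7),
        "payload_type": byte1 & 0x7F,
        "seq": seq,
        "timestamp": timestamp,
        "ssrc": ssrc,
    }


def missing_sequence_numbers(received):
    """List sequence numbers absent from the received set (no wraparound)."""
    seen = set(received)
    return [s for s in range(min(seen), max(seen) + 1) if s not in seen]


if __name__ == "__main__":
    header = pack_rtp_header(payload_type=31, seq=7, timestamp=90000, ssrc=0x1234)
    print(parse_rtp_header(header))
    print("lost:", missing_sequence_numbers([1, 2, 4, 7]))
```

A receiver that sees the payload type change from one packet to the next simply switches decoders, which is how a mid-session change of encoding is signalled without any separate negotiation message.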

RTP  does  not  ensure  timely  delivery  or  provide  QoS  guarantees.  It  does  not 
guarantee delivery or prevent out-of-order delivery, nor does it assume that the underlying
network  is  reliable.  For  applications  like  videoconferencing  that  require  these  types  of 
guarantees,  RTP  must  be  accompanied  by  other  mechanisms  [Johnson,  97]. 
c.  Real-Time  Control  Protocol  (RTCP)

RTCP, also standardized in RFCs 1889 and 1890, is a control protocol that
works in conjunction with RTP. RTCP control information, periodically transmitted by each
participant in an RTP session to all other participants, is used by the applications to
control the performance of the conference and for diagnostic purposes.

RTCP performs four primary functions: (a) First, RTCP provides feedback
information about the quality of the transmission to the applications. The statistics include
the number of packets sent, the number of packets lost, interarrival jitter, etc. (b) RTCP
also identifies the RTP source through a transport-level identifier called the canonical
name (CNAME). The CNAME is used to keep track of participants in a session and to
associate their audio and video streams so that they can be synchronized. (c) RTCP controls
its transmission intervals in order to prevent control traffic from overwhelming network
resources. RTCP control traffic is limited to five percent of the overall session traffic.
This limit allows RTP to scale up to a large number of session participants. (d) An optional
function can be used to convey a small amount of information to all session participants. In
distance learning, this information can be used to identify the participants in a particular
training session. For example, RTCP might carry a personal name to identify a participant on
the user’s display [Johnson, 97].
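
The effect of the five percent limit can be seen in a simplified calculation of how often each participant may send an RTCP report. This sketch omits details of the RFC 1889 algorithm (such as the split between senders and receivers and the randomization of the interval); the 100-byte average report size and the 5-second minimum interval are assumptions used for illustration, and the function name is hypothetical.

```python
def rtcp_interval_seconds(session_kbps, members, avg_rtcp_bytes=100,
                          rtcp_fraction=0.05, min_interval=5.0):
    """Approximate seconds between RTCP reports from one participant.

    All participants share rtcp_fraction of the session bandwidth, so the
    interval grows linearly with the number of members.
    """
    if session_kbps <= 0 or members < 1:
        raise ValueError("session_kbps must be positive and members >= 1")
    rtcp_bits_per_second = session_kbps * 1000 * rtcp_fraction
    interval = members * avg_rtcp_bytes * 8 / rtcp_bits_per_second
    return max(min_interval, interval)


if __name__ == "__main__":
    # Assumed example: a 128 kbit/s session with a growing audience.
    for n in (4, 40, 400, 4000):
        print(f"{n:5d} members -> one report every "
              f"{rtcp_interval_seconds(128, n):.1f} s")
```

The total control traffic stays bounded as the class grows; what changes is how stale each participant's reception report becomes.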

Since RTCP sends feedback to all of the recipients of a multicast stream,
individual users can determine whether a problem is specific to the local end node or
system-wide. RTP and RTCP information is simply data from the point of view of the routers
that move the packets to their destinations. To prioritize data streams and provide a
guaranteed quality of service, other protocols must be used [Steinke, 96].
d.  Resource  Reservation  Protocol  (RSVP)

In  an  Internet  environment  with  a  myriad  of  routers  and  switches,  packet 
queuing  can  lead  to  variable  packet  delivery  delays  in  different  parts  of  the  network.  QoS 
considerations  for  a  multicast  application  include  tolerance  to  jitter,  delay,  and  lost  packets. 
In  order  for  the  network  to  provide  QoS,  applications  must  be  able  to  reserve  and  control 
network  services  [Johnson,  97].  This  is  not  an  issue  on  networks  with  sufficient  bandwidth, 
but  considering  the  packet-based  networks  targeted  for  use  in  this  thesis,  QoS  is  a  major 
issue. 

The Resource Reservation Protocol (RSVP) is a draft protocol for resource
reservation, still under development [Hurwicz, 97]. Elementary RSVP requests consist of
dynamic  request  specifications  for  end-to-end  desired  QoS  and  definitions  of  the  set  of  data 
packets  to  receive  the  QoS.  It  aims  to  efficiently  set  up  a  guaranteed  QoS  resource 
reservation,  supporting  unicast  and  multicast  routing  protocols,  and  is  expected  to  scale  well 
for  large  multicast  delivery  groups.  RSVP  is  useful  in  environments  where  QoS 
reservations  can  be  supported  by  reallocating  (rather  than  adding)  resources.  In  IP 
multicast,  a  host  sends  an  IGMP  message  to  join  the  group  and  then  sends  an  RSVP 
message  to  reserve  resources  along  the  delivery  path(s)  of  that  group.  The  RSVP  service 
request  is  initially  sent  to  a  local  server.  The  local  server  will  validate  the  request  and  then 
forward  the  request. 

RSVP  promises  access  to  Internet  integrated  services.  The  hosts  and  the 
network  work  together  to  achieve  guaranteed  quality  of  end-to-end  transmission.  However, 
in  order  to  achieve  end-to-end  QoS,  all  hosts,  routers  and  other  network  infrastructure 
elements  between  the  receiver  and  sender  must  support  RSVP.  They  must  reserve  system 
resources  such  as  bandwidth,  CPU  and  memory  buffers  in  order  to  satisfy  QoS  reservations.
RSVP  rides  on  top  of  IP,  and  is  used  by  routers  to  deliver  QoS  control  requests  to  all  nodes 
along  the  path(s)  and  to  establish  and  maintain  statistics  in  order  to  provide  the  requested 
services.  After  the  reservation  has  been  made,  the  router  supporting  RSVP  determines  the 
route  and  QoS  class  for  each  incoming  packet  and  the  scheduler  makes  forwarding  decisions 
for  every  outgoing  packet  [Johnson,  97]. 

Since  RSVP  is  receiver  initiated,  resource  requests  are  in  only  one  direction. 
At  each  node  along  the  reverse  path  to  the  receiver,  RSVP  attempts  to  make  a  resource 
reservation  for  the  requested  stream.  This  receiver-initiated  propagation  delivers  control 
messages  only  up  to  the  node  of  the  spanning  tree  where  they  merge  with  another 
reservation  for  the  same  source  stream,  thus  preserving  bandwidth.  This  receiver  initiation 
achieves  two  goals:  scalability,  because  the  receiver-initiated  joining  delivers  control 
messages  only  along  those  parts  of  the  tree  that  need  the  information;  and  heterogeneity, 
because  of  the  receiver  orientation,  individual  receivers  can  choose  to  participate  and 
request  different  levels  of  reservation.  [Precept,  97] 
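
The merging of receiver-initiated requests can be sketched on a small distribution tree. The sketch assumes a simplified merging rule in which a node requests upstream only the largest reservation needed below it; actual RSVP reservation styles are more varied. The tree, node names, and bandwidth figures are hypothetical.

```python
def link_reservations(tree, requests, root):
    """Compute the bandwidth reserved on the link entering each node.

    tree maps a node to its downstream children; requests maps receiver
    nodes to the bandwidth (kbit/s) each asks for. Requests travel toward
    the source and are merged where branches meet, so an upstream link
    carries one reservation sized for its most demanding receiver.
    """
    reserved = {}

    def visit(node):
        need = requests.get(node, 0.0)
        for child in tree.get(node, []):
            need = max(need, visit(child))
        reserved[node] = need
        return need

    visit(root)
    return reserved


if __name__ == "__main__":
    tree = {"source": ["r1"], "r1": ["ship-a", "r2"], "r2": ["ship-b", "ship-c"]}
    requests = {"ship-a": 64, "ship-b": 128, "ship-c": 384}
    for node, kbps in link_reservations(tree, requests, "source").items():
        print(f"{node:8s} {kbps} kbit/s")
```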

Based upon the admission and policy controls of the underlying hardware, one of
two general actions takes place at each node: the node makes a reservation or forwards
the request upstream. These controls are not a part of RSVP, but are utilized by the
equipment. Admission control determines whether the node has sufficient resources, and
policy control determines whether the user has authorization to make a reservation. If
the reservation is rejected, RSVP returns an error message to the appropriate receiver(s). If
accepted, the node is configured to provide the desired QoS. If the RSVP request is
forwarded upstream, it continues to propagate along the reverse path towards the appropriate
senders. [Johnson, 97]
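
The two checks made at each node can be sketched as below. This is a simplified model, not RSVP itself: the class, the result strings, the capacities, and the user names are all hypothetical, and the sketch does not release reservations already made at earlier hops when a later hop rejects the request.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    capacity_kbps: float
    authorized_users: set = field(default_factory=set)
    reserved_kbps: float = 0.0

    def handle_request(self, user: str, kbps: float) -> str:
        # Policy control: is this user allowed to reserve resources here?
        if user not in self.authorized_users:
            return "error: policy control rejected"
        # Admission control: does the node have enough unreserved capacity?
        if self.reserved_kbps + kbps > self.capacity_kbps:
            return "error: admission control rejected"
        self.reserved_kbps += kbps
        return "reserved"


def reserve_along_path(path, user, kbps):
    """Walk the reverse path from the receiver toward the sender."""
    for node in path:
        result = node.handle_request(user, kbps)
        if result != "reserved":
            return f"{node.name}: {result}"  # error goes back to the receiver
    return "reserved end-to-end"


if __name__ == "__main__":
    path = [Node("r1", 512, {"student"}), Node("r2", 256, {"student"})]
    print(reserve_along_path(path, "student", 128))
    print(reserve_along_path(path, "student", 200))
    print(reserve_along_path(path, "guest", 10))
```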

One drawback of RSVP is the computational load placed on routers, which must
inspect and handle packets in priority order. Approaches such as tag switching
are  being  developed  to  help  with  this  drawback.  Another  area  of  research  is  enhancing 
RSVP  to  use  routing  services  that  provide  alternate  and  fixed  paths.  Finally,  RSVP  has  no 
way  to  handle  network  overload  that  may  occur  if  multiple  users  request  the  maximum 
bandwidth  at  the  same  time  [Andrews,  97]. 

RSVP  continues  to  be  under  review  by  the  Internet  Engineering  Task  Force 
(IETF), and is not widely deployed. Similar work has been done on Internet Protocol
Version 6 (IPv6) to support resource reservation and flow setup for multicasting. Figure 4-3
is  an  illustration  of  the  method. 
Figure  4-3  RSVP  Protocol  [Johnson,  97]


e.  Real-Time  Streaming  Protocol  (RTSP) 

RTSP is considered more of a framework than a protocol. It works at the
application level for unicast and multicast streaming, and is intended to enable
interoperability between different vendors’ clients and servers. RTSP essentially encodes
and passes multimedia stream control commands. In many respects, its functionality
resembles that of a VCR remote control.

2.  Reliable  IP  Multicast 

Reliable connectivity ensures that all packets are received by all of the recipients.
For unicast IP services, error detection and correction are provided at the TCP layer. But
such traditional techniques for error detection and correction in a large-scale multicast
environment might result in an “ACK implosion” or a “NAK implosion,” where the
excessively large number of acknowledgement messages from large groups can swamp the
originating hosts sending the desired streams.
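
The scale of the feedback problem follows from simple multiplication. In the sketch below, the group size, packet count, and one percent loss rate are assumed figures for illustration, and the function names are hypothetical.

```python
def ack_messages(receivers: int, packets: int) -> int:
    """Positive acknowledgement: every receiver acknowledges every packet."""
    return receivers * packets


def expected_nak_messages(receivers: int, packets: int, loss_rate: float) -> float:
    """Negative acknowledgement: a receiver reports only the packets it
    missed, so feedback is proportional to the loss rate."""
    return receivers * packets * loss_rate


if __name__ == "__main__":
    # Assumed example: 1,000 receivers, 500 packets, 1% independent loss.
    print("ACKs returned to the sender:", ack_messages(1000, 500))
    print("expected NAKs:", expected_nak_messages(1000, 500, 0.01))
```

Even the NAK-based figure grows with the size of the group, which is why reliable multicast proposals generally try to suppress or aggregate feedback before it reaches the source.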

There  are  currently  no  IETF  standards  for  reliable  IP  multicast,  but  several  Internet 
drafts  have  been  submitted  related  to  reliable  multicasting,  and  an  IRTF  (Internet  Research 
Task  Force)  working  group  has  been  formed  to  advance  reliable  multicast  standards  efforts 
[Johnson,  97]. 

Cisco’s  proposed  Pretty  Good  Multicasting  (PGM)  Reliable  Transport  Protocol  is 
intended  to  make  multicasting  appropriate  for  mission-critical  uses.  Although  this  work  is 
still  under  development,  this  protocol  can  be  useful  in  areas  such  as  common  tactical 
pictures. 

As mentioned previously, videoconferencing applications are able to tolerate missing
data and still provide discernible video and audio. They also do not require guaranteed
in-sequence delivery of IP packets. Therefore, videoconferencing end-systems will not need
bit-perfect, in-order, acknowledged data. For military purposes, multicast reliability
is more essential to the common tactical architecture and cooperative
engagement issues. [Petitt, 96] evaluates the design choices of several reliable transport
layer multicast protocols that support those requirements.


3.  Group  Setup  Protocols 

Users  of  videoconferencing  must  not  only  know  about  upcoming  or  current  IP 
multicast  sessions,  but  also  how  to  manage  and  coordinate  them.  Parameters  for  sessions 
will  include  information  such  as  the  name  and  topic  of  the  session;  its  multicast  address; 
date,  time  and  duration;  media  types  (e.g.  audio),  media  encoding,  and  media  ports;  security 
parameters; etc. There are currently several Internet drafts for these types of protocols, but
a clear standard has not emerged. Still, there are tools currently available. For example, the
session  directory  tool,  sdr,  is  widely  used  on  the  MBone.  Similarly,  Precept’s  IP/TV 
Program  Guide  has  a  directory  embedded  in  a  Web  page. 
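
The session parameters listed above can be pictured as a simple record. The structure below is purely illustrative: the field names and sample values are hypothetical and do not reproduce the format used by sdr, IP/TV, or any Internet draft.

```python
from dataclasses import dataclass, field
import ipaddress


@dataclass
class SessionAnnouncement:
    name: str
    topic: str
    multicast_address: str
    start: str                 # date and time, e.g. "1998-09-14 13:00Z"
    duration_minutes: int
    # media type -> (encoding, port), e.g. "audio" -> ("PCMU", 5004)
    media: dict = field(default_factory=dict)
    security: str = "none"

    def __post_init__(self):
        # A session can only be announced on a Class D (group) address.
        if not ipaddress.ip_address(self.multicast_address).is_multicast:
            raise ValueError(f"{self.multicast_address} is not a multicast address")


if __name__ == "__main__":
    lecture = SessionAnnouncement(
        name="Distance Learning Lecture",
        topic="Networking fundamentals",
        multicast_address="239.1.2.3",
        start="1998-09-14 13:00Z",
        duration_minutes=50,
        media={"audio": ("PCMU", 5004), "video": ("H261", 5006)},
    )
    print(lecture)
```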


4.  Other  IP  Multicast  Issues 

a.  Router  Support 

As  with  routing  any  IP  datagram,  multicasting  requires  routers  to  interact 
with each other and exchange information about their neighbors. To implement IP multicast
most effectively, the best routing protocol for the given network layout must be
determined. On a routed network that supports native multicast, IP multicast traffic for a
particular source and destination group is typically transmitted via a spanning tree that
connects all of the hosts in the group. There are basically two approaches to multicast
routing: Dense-Mode and Sparse-Mode.

Dense-Mode  multicast  routing  protocols  follow  an  approach  that  assumes 
that  the  multicast  group  members  are  densely  distributed  throughout  the  network  and 
bandwidth  is  abundant.  These  protocols  rely  on  periodic  flooding  of  the  network  with 
multicast  traffic  to  distribute  group  membership  information  to  all  nodes  in  the  network  in 
order  to  set  up  and  maintain  the  spanning  tree.  The  protocols  include  Multicast  Open 
Shortest  Path  First  (MOSPF),  described  in  RFC  1584,  Protocol-Independent  Multicast- 
Dense  Mode  (PIM-DM),  and  the  earlier  Distance-Vector  Multicast  Routing  Protocol 
(DVMRP),  described  in  RFC  1075.  DVMRP  is  currently  used  on  the  MBone,  but  is
becoming  obsolete. 

Sparse-Mode  protocols  are  based  upon  the  assumption  that  the  multicast 
group  members  are  sparsely  distributed  throughout  the  network  and  bandwidth  is  not 
necessarily widely available. Flooding in this case is not economical because of the wasted
bandwidth and the latency problems that occur when transmitting IP over large geographic
regions.  Sparse-Mode  routing  protocols  like  Core  Based  Trees  (CBT),  RFC  2189,  and 
Protocol-Independent  Multicast-Sparse  Mode  (PIM-SM),  RFC  2117,  are  possible  choices. 
They  build  a  single  distribution  tree,  which  is  formed  around  a  focal  router  (called  a  core  in 
CBT  and  rendezvous  point  in  PIM-Sparse  Mode).  Multicast  traffic  for  the  entire  group  is 
sent  and  received  over  the  same  tree,  regardless  of  the  source.  The  use  of  a  shared  tree  can 
provide  significant  bandwidth  savings  for  applications  that  have  many  active  senders. 
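
The idea of a single tree formed around a focal router can be sketched with a small graph example. This is not CBT or PIM-SM; it only illustrates joining each member to the core along a shortest path and sharing the resulting links. Hop count is assumed as the path metric, and the topology and names are hypothetical.

```python
from collections import deque


def shortest_path(graph, start, goal):
    """Breadth-first search for a minimum-hop path from start to goal."""
    parents = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for neighbor in graph.get(node, []):
            if neighbor not in parents:
                parents[neighbor] = node
                queue.append(neighbor)
    raise ValueError(f"no path from {start} to {goal}")


def shared_tree(graph, core, members):
    """Union of the member-to-core paths: one tree used by every sender."""
    links = set()
    for member in members:
        path = shortest_path(graph, member, core)
        links.update(frozenset(pair) for pair in zip(path, path[1:]))
    return links


if __name__ == "__main__":
    graph = {
        "core": ["r1", "r2"],
        "r1": ["core", "a", "b"],
        "r2": ["core", "c"],
        "a": ["r1"], "b": ["r1"], "c": ["r2"],
    }
    for link in sorted(tuple(sorted(l)) for l in shared_tree(graph, "core", ["a", "b", "c"])):
        print(link)
```

Routers on the tree keep state per group rather than per source and group pair, which is the saving the text attributes to shared trees when many participants send.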

Another concern is that many Internet Service Providers (ISPs) do not have
a protocol to deal with inter-domain multicast routing (IDMR). Existing multicast routing
protocols such as Protocol Independent Multicast (PIM), Multicast Open Shortest Path First
(MOSPF), and Distance Vector Multicast Routing Protocol (DVMRP) were not designed for
multiple autonomous systems that do not necessarily want to share all their routing
information. [Hurwicz, 97]

Although Border Gateway Protocol (BGP) provides inter-domain routing
capabilities for IP, there is no equivalent of BGP for IP Multicast. Currently the IETF’s
IDMR working group is developing a Border Gateway Multicast Protocol (BGMP)
specification. Until this shortcoming is addressed, the lack of an IDMR protocol limits
the scalability of IP Multicast and, along with limited bandwidth, is one of the major
reasons why the MBone has only about 30,000 users. Furthermore, growth will continue to be
limited if all of the routers have to contain all of the routing information for the whole
network [Hurwicz, 97].

Although new routers on the Internet are capable of supporting multicast,
most are not IP multicast enabled by default. Many ISPs are reluctant to deploy multicast
because of concerns such as the cost and complexity of upgrading older routers, the router
resources consumed, reliability problems, an unclear business model (how does an ISP
charge for traffic, who pays, and how does peering, i.e., communications between ISPs,
work?), and a lack of diagnostic, simulation, and debugging tools. Even with these concerns,
some ISPs have already deployed multicast. For example, UUNET offers IP multicast as a
value-added service on its network. It has equipped each of its domestic Points of Presence
(POPs) with multicast routers in order to provide multicast service connections throughout
the continental United States. More ISPs are expected to begin implementing multicasting
by next year, especially as backbone traffic continues to rise and the cost threshold for
users decreases.

There  is  also  the  issue  of  incorporating  QoS  routing  with  various  multicast 
routing protocols. Native IP multicast protocols use various approaches to construct
delivery trees for efficient transmission. But without additional mechanisms, those routing
approaches are not guaranteed to provide a specified QoS. For example, when QoS
mechanisms are used to reserve and control network resources, the routers must not only
satisfy the added QoS requirements but also find the shortest path to a
destination when constructing a delivery tree.
b.  Other  Network  Issues

Many  IP  multicast  implementations  have  not  been  thoroughly  tested  because 
many  organizations  have  not  enabled  multicast  capabilities  in  their  networks  [Hurwicz,  97]. 
Furthermore,  there  is  no  widely  known  data  on  how  routers  will  react  to  a  steady,  high 
volume of multicast multimedia traffic. Because IP Multicast uses the connectionless User
Datagram Protocol (UDP), and the most popular type of firewall, the application gateway,
cannot secure connectionless protocols, IP multicast is essentially incompatible with most
firewall strategies. In some applications, in order to allow transmissions through a firewall,
TCP is used in conjunction with UDP, by means of tunneling and a ported multicast routing
program running on a host. Many firewall applications and routers will need to be
reconfigured, replaced, or upgraded in order to deal with multicast addressing, reliability,
and bandwidth issues.

D.  MULTICAST  BACKBONE  (MBone) 

When  LANs,  WANs,  and  the  Internet  were  initially  developed  and  designed, 
videoconferencing was not expected to be a viable possibility. Because of limited bandwidth,
sending video or audio was not considered possible or practical. However, as the
technology  matured,  the  Multicast  Backbone  (MBone)  and  video/audio  compression 
techniques  were  developed  showing  that  videoconferencing  was  not  only  possible  but  also 
practical. 

The  MBone  is  an  experimental,  virtual  network  that  lies  on  top  of  the  Internet.  It 
was  initiated  in  early  1992  and  named  by  Steve  Casner  of  the  University  of  Southern 
California  Information  Sciences  Institute.  It  provides  one-to-many  and  many-to-many 
network delivery services for multicast-capable applications such as videoconferencing.
The MBone originated from a collaboration to multicast audio and video from meetings
of the Internet Engineering Task Force, and has been the testbed for many of the multicast
protocols mentioned earlier, such as IGMP and RTP. The MBone is continually being
developed  by  hundreds  of  researchers  who  are  designing  more  effective  and  efficient 
protocols  and  applications  for  videoconferencing.  This  section  gives  a  brief  introduction  to 
the  MBone  to  provide  an  example  of  the  viability  of  multicasting  video  and  audio  over  IP- 
based  network  architectures. 


1.  MBone  Requirements 

The  major  technical  prerequisite  that  makes  multicasting  possible  over  the  MBone  is 
the use of network routers called mrouters. Basically, mrouters are upgraded commercial
routers, dedicated UNIX workstation-class machines, or dedicated UNIX workstation-class
machines running with modified kernels in parallel with standard commercial routers
[Macedonia,  Brutzman,  94].  More  and  more  commercial  routers  are  now  supporting 
multicast.  This  will  help  eliminate  the  inefficiencies  and  management  headaches  of 
duplicate  routers  and  tunnels  [Macedonia,  Brutzman,  94].  The  mrouters  use  the  IGMP 
protocol  to  learn  the  existence  of  host  group  membership  on  their  directly  attached  subnets, 
to  identify  designated  multicast  routers  in  a  LAN,  and  to  propagate  group  membership 
information  over  the  MBone.  Tunneling  further  augments  MBone  by  allowing  multicast 
datagrams  to  be  forwarded  to  other  MBone  subnets  that  support  IP  multicast.  For  example, 
at  the  sending  mrouter,  IP  multicast  datagrams  are  encapsulated  by  unicast  IP  datagrams  and 
forwarded  as  unicast  IP  datagrams  so  that  intervening  unicast  routers  and  subnets  can  handle 
them. The receiving mrouters will “strip” the encapsulating unicast IP header from the
multicast datagram in order to determine if any of their attached hosts are requesting to join that
multicast group.
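
The encapsulation and stripping steps described above can be illustrated with a minimal Python sketch. The Datagram class, the function names, and all addresses are hypothetical and chosen only for illustration; real mrouters operate on binary IP headers rather than on objects like these.

```python
from dataclasses import dataclass
from ipaddress import ip_address
from typing import Union


@dataclass(frozen=True)
class Datagram:
    """Toy model of an IP datagram: addresses plus a payload."""
    src: str
    dst: str
    payload: Union[bytes, "Datagram"]


def encapsulate(inner: Datagram, tunnel_src: str, tunnel_dst: str) -> Datagram:
    """Wrap a multicast datagram in a unicast datagram for the tunnel."""
    if not ip_address(inner.dst).is_multicast:
        raise ValueError("only multicast datagrams are tunneled")
    return Datagram(src=tunnel_src, dst=tunnel_dst, payload=inner)


def decapsulate(outer: Datagram) -> Datagram:
    """Strip the unicast wrapper at the receiving mrouter."""
    if not isinstance(outer.payload, Datagram):
        raise ValueError("no encapsulated datagram found")
    return outer.payload


# Hypothetical addresses: one sending host, one multicast group, two mrouters.
original = Datagram(src="10.1.1.5", dst="224.2.0.1", payload=b"video frame")
tunneled = encapsulate(original, tunnel_src="192.0.2.1", tunnel_dst="198.51.100.1")
```

Intervening unicast routers see only the outer addresses of `tunneled`; the receiving mrouter recovers the multicast datagram with `decapsulate(tunneled)`.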

As  mentioned  earlier,  the  overarching  issue  in  videoconferencing  is  bandwidth.  IP 
multicasting  partly  addresses  this  issue  by  enabling  one  packet  of  information  to  reach  many 
destinations. For example, a 128-kilobit per second video stream (based on the typical data rate
of  two  channels  of  ISDN)  uses  the  same  bandwidth  whether  it  is  received  by  one  location  or 
20.  However,  there  is  one  disadvantage.  If  all  mrouters  permitted  packets  to  touch  every 
workstation  in  the  MBone,  video  streams  might  potentially  misspend  valuable  bandwidth  by 
sending  streams  to  LANs  that  are  not  participants.  For  that  reason,  controls  are  needed  to 
limit the propagation of video stream packets across the MBone. Controls on multicast
packet propagation are implemented in two ways: the MBone limits the time to live (ttl) of
multicast packets, or it uses complex pruning algorithms to adaptively restrict the
transmission of multicast packets [Macedonia, Brutzman, 94]. MBone protocol
developers are successfully experimenting with automatically pruning and grafting subtrees,
and thresholds can set maximum bandwidth limits. The truncation is accomplished by
setting the ttl in a packet. The ttl is decremented, by one or more, each time the packet passes
through an mrouter. For example, if the ttl were set to 16, the packet would be multicast on a smaller scale
such as a school campus. If the ttl were 128, it could potentially traverse most of the subnets
on  the  MBone.  Adjusting  the  ttl  can  assist  in  limiting  the  transmission  of  video  stream  data 
to  specific  regions  or  areas.  Consequently,  effective  controls  over  the  MBone  can  save 
precious  bandwidth  that  the  uncontrolled  transmitted  packets  might  otherwise  use. 
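
The effect of the ttl on propagation can be sketched in a few lines of Python. This is a simplified model, not the MBone implementation: it assumes each mrouter decrements the ttl by exactly one and forwards the packet only while the remaining ttl is at least a threshold configured for that hop. The threshold values below are hypothetical.

```python
def hops_reached(initial_ttl, thresholds):
    """Return how many mrouters in a chain forward the packet onward.

    thresholds[i] is the (hypothetical) minimum ttl that mrouter i
    requires before forwarding a packet to the next region.
    """
    ttl = initial_ttl
    forwarded = 0
    for threshold in thresholds:
        ttl -= 1  # assumption: each mrouter decrements the ttl by one
        if ttl <= 0 or ttl < threshold:
            break  # packet is dropped here and travels no further
        forwarded += 1
    return forwarded


# Hypothetical chain: campus, site, region, and wide-area boundaries.
CHAIN = [1, 16, 64, 128]
campus_scope = hops_reached(16, CHAIN)
wide_scope = hops_reached(128, CHAIN)
```

Under these assumptions a packet sent with a ttl of 16 is stopped at the second boundary, while a packet sent with a ttl of 128 crosses three of the four boundaries.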
In order to make the MBone community a viable and efficient topology, global
coordination  is  used  to  minimize  congestion  on  the  Internet.  To  add  a  new  node  to  the 
MBone,  a  new  site  announces  itself  to  its  Internet  Service  Provider  (ISP)  or  the  MBone 
mailing  list.  Then,  the  nearest  network  providers  decide  on  the  most  advantageous  path 
connection  to  minimize  local  or  regional  Internet  traffic. 

MBone  uses  various  application  tools  in  order  for  end-users  to  receive  and  deliver 
videoconferencing. The common applications are the videoconference tool (vic), visual audio
tool (vat), robust audio tool (rat), shared whiteboard (wb), and session directory (sdr). Vat is
used for audio teleconferences. The shared whiteboard (wb), using T.120 protocols, can be used
as  a  shared  drawing  surface,  and  it  can  be  used  to  export  and  view  postscript  files.  The  sdr 
tool  dynamically  announces  the  availability  of  sessions  by  displaying  active  multicast 
groups.  Sdr  also  launches  multicast  applications  and  automatically  selects  unused  addresses 
for  any  new  groups.  Sdr  makes  announcements  periodically  over  a  well-known  multicast 
address  and  port. 
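
The idea of automatically selecting an unused group address can be sketched as follows. The text does not describe how sdr actually chooses addresses, so this is a simple random-choice stand-in; the address pool, the function name, and the retry limit are all assumptions made for illustration.

```python
import random
from ipaddress import IPv4Address, IPv4Network

# Hypothetical pool of multicast addresses available for new sessions.
SESSION_POOL = IPv4Network("224.2.128.0/17")


def pick_unused_address(in_use, rng=random, max_tries=1000):
    """Return a random address from the pool that is not already announced."""
    used = {IPv4Address(a) for a in in_use}
    first = int(SESSION_POOL.network_address)
    for _ in range(max_tries):
        candidate = IPv4Address(first + rng.randrange(SESSION_POOL.num_addresses))
        if candidate not in used:
            return candidate
    raise RuntimeError("no unused address found")
```

In this sketch, `in_use` would be the set of group addresses learned from the announcements of other sessions.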

One  of  the  first  significant  uses  of  the  MBone  came  about  when  NASA  Select  set  up 
an  in-house  cable  channel  broadcast  during  space  shuttle  missions,  which  then  could  be 
viewed live from any MBone user’s desktop computer.

Although many practical applications have been developed on the MBone, it
continues to be used as a testing ground for IP multicast research and how it can be
leveraged for distance learning. One thesis, Internetworking: Economical Storage and
Retrieval of Digital Audio and Video for Distance Learning [Tiddy, 96], investigates the
usefulness  and  feasibility  of  applying  networked  storage  of  digitized  video  and  audio,  all  via 
the  MBone  for  distance  learning.  Currently  there  are  prototypes  that  are  being  used  to 
deliver stored digitized data over the MBone. The Interactive Multimedia Jukebox Project,
which  can  be  found  at  http://imj.gatech.edu,  is  a  research  effort  to  investigate  the  scalable 
delivery  of  video-on-demand  (VoD)  service  using  multicast  communication.  The  MBone 
VCR on Demand Project, at http://www.informatik.uni-mannheim.de/informatik/pi/projects/MVoD/,
offers a solution for the interactive remote recording and playback of
multicast videoconferences.


E.  MBone  ISSUES  IN  DISTANCE  LEARNING 

Because  it  was  originally  a  developmental  tool,  the  MBone  has  seen  limited  use  in 
the  commercial  environment,  but  it  has  already  proved  the  great  benefits  of  IP  multicasting. 
It  has  great  potential  to  grow  and  cover  the  entire  Internet.  Nevertheless,  many  network 
service  providers  have  not  enabled  multicasting  in  many  of  their  routers  for  various  reasons. 
Among them are the immaturity of the technology, uncertainty over whether ATM or IP (or
a combination of both) is the direction to take, and pricing issues. Many regional network
service providers still do not have an MBone connection.

MBone  is  not  easy  to  set  up.  Enabling  a  router  for  multicasting  and  installing 
MBone  tools  is  still  something  not  normally  done  by  network  administrators.  Many  are 
leery  about  how  video  services  will  impact  network  bandwidth.  Also,  MBone  tools  are 
mainly  developed  for  running  on  UNIX  machines,  and  there  are  still  problems  porting  the 
tools  to  Windows  machines.  Finally,  the  tools  aren’t  as  user  friendly  as  some  of  the 
commercial  products. 
The commercial sector is discovering the viability of multicasting and is starting to
develop tools that are based upon the MBone standards. Products such as White Pine’s
CU-SeeMe and Precept’s IP/TV are already MBone-compatible.
Videoconferencing  applications  will  continue  to  mature,  and  likely  the  myriad  of  standards 
will  eventually  converge.  This  process  can  be  accomplished  more  easily  if  the  newer 
products  are  based  upon  thoroughly  evaluated  tools. 


F.  SUMMARY 

This  chapter  discusses  the  major  multicasting  protocols,  technologies  and  issues  that 
are  pertinent  to  using  videoconferencing  as  a  part  of  distance  learning.  It  describes  the 
baseline  issues  that  need  to  be  addressed  in  order  to  multicast  distance  learning  lectures  to 
numerous  recipients  across  an  IP-based  network  to  sea.  These  proven  protocols  will  make 
videoconferencing over IP networks in DoD a practical solution. One primary reason is that,
unlike dedicated networks, multicast groups can be dynamically set up and torn
down. This flexibility is needed because of the constantly changing location of end-users
such  as  those  receiving  distance  learning  at  sea. 

Standards  like  IP  multicasting,  and  the  future  implementation  of  IPv6,  will  address 
some  of  the  QoS  issues  by  supporting  resource  reservation  and  flow  setup.  Also,  as  older 
routers  are  replaced  or  upgraded  to  support  multicast,  videoconferencing  over  the  Internet 
and  NIPRnet  between  groups  at  numerous  locations  will  become  commonplace.  IPv6  is 
designed  to  help  improve  delivery  of  data  at  regular  intervals,  which  will  help  address 
Quality  of  Service  (QoS)  issues.  Its  packet  headers  will  help  define  the  types  of  service 
(high-quality paths in the underlying network) that can be used for real-time delivery of audio
and  video. 

This chapter has also shown that, based upon the thorough testing and
implementation of multicasting, the hurdle currently facing the widespread emplacement of IP
multicasting is deployment rather than the technology.
V. IMPLEMENTING IP MULTICAST ACROSS THE NAVAL NETWORK ARCHITECTURE TO SEA

A.  INTRODUCTION 

This  chapter  provides  an  analysis  of  numerous  options  that  can  be  used  to  leverage 
DISN  for  IP  multicast.  They  will  include  desktop  connectivity,  the  Unclassified  but 
Sensitive Internet Protocol Router Network (NIPRnet), satellite entry points (gateways),
Defense  Satellite  Communications  System  (DSCS)  and/or  C  band  SHF  terminals  (Challenge 
Athena),  and  Automated  Digital  Networking  System  (ADNS). 


B.  BACKGROUND 

The goal of the Defense Information Infrastructure (DII) is to establish a seamless,
secure, robust, agile, reliable and cost-effective telecommunications network that will serve
as the end-to-end information transfer infrastructure for all DoD personnel and organizations
worldwide [DISA, 96]. The Defense Information Systems Network (DISN) architecture, a
component of the DII, is based upon a global network integrating existing Defense
Communications Systems assets, Military Satellite Communications (MILSATCOM),
Commercial SATCOM initiatives, leased telecommunications services, dedicated DoD
Service and Defense Agency networks, and mobile/deployable networks; i.e., the
consolidated worldwide enterprise-level telecommunications infrastructure that provides the
end-to-end information transfer component of the DII [DISA, 96].
Through the Defense Information Systems Agency (DISA), DoD is continuously
identifying what architecture and standards DISN needs for a telecommunications
infrastructure that can support voice and video. Currently, the Defense Video Service -
Global (DVS-G), the transport network that DISN uses for videoconferencing, is mostly a
collection  of  dedicated  room-based  systems  whose  terrestrial  components  are  connected  by 
ISDN  services.  One  other  segment  of  DISN  that  can  be  used  to  support  videoconferencing 
is  NIPRnet.  NIPRnet  is  an  IP-based  network  that  consists  of  the  wide-area  and  local-area 
network  switching  and  transmission  systems  along  with  customer  premises  equipment 
(CPE)  in  order  to  provide  connectivity  to  DoD  users. 


C.  DESKTOP  SYSTEMS  CONNECTIVITY 

1.  POTS 

Videoconferencing  applications  conducted  on  DISN  over  Ethernet,  token-ring,  or 
serial  modem  connections  are  straightforward.  Under  the  DISN  transmission  services 
CONUS (DTS-C), AT&T provides information transport for the aggregate bandwidth of all
customer  Service  Delivery  Points  homed  off  the  Bandwidth  Managers  located  in  their 
respective  access  areas.  Figure  5-1  is  a  diagram  of  the  CONUS  transmission  service.  To 
take  advantage  of  the  bulk  transmission  rates,  AT&T  bundles  the  access  transmission  into 
SONET  for  delivery  to  the  Bandwidth  Managers.  At  the  customer  access  locations, 
transmission bandwidth interfaces at T1, T3 and SONET are provided. AT&T teams with
Local  Access  Providers  as  required  to  accomplish  the  access  area  bandwidth  requirements. 
Figure 5-1 DISN Architecture [DISA, 96]


For  POTS  connectivity,  commercial  and  DISN  networks  with  dedicated  dial  up 
connectivity  can  be  used.  However,  even  with  optimal  desktop  hardware  and  software, 
performance  is  always  a  question  due  to  throughput  problems  associated  with  modem 
connections  and  dirty  analog  lines,  which  can  cause  bit  errors  and  retransmissions. 

2. Asymmetric Digital Subscriber Line (ADSL)

A  twisted-pair  phone  line  has  a  capacity  far  beyond  the  narrow  3-kHz  channel  used 
to carry an analog voice signal. Historically that capacity has gone unused, because
it was reserved to compensate for signal loss in the line. A reemerging technology,
Asymmetric Digital Subscriber Line (ADSL), overcomes this limitation and promises to
provide download data rates up to 8Mbps to desktops, while upstream transmission rates will be at
least ten times traditional modem data rates. ADSL is a modem technology that requires
terminal  devices  at  each  end  of  the  phone  line  (user  to  Local  Exchange  Carrier  —  LEC). 
Because  of  the  high  frequencies  ADSL  uses,  the  distance  between  the  modem  and  the 
central  office  plays  a  significant  role  in  an  ADSL  modem  throughput.  The  closer  the 
modem  is  to  the  central  office,  the  less  signal  degradation  occurs.  For  example,  the 
maximum  distance  from  the  central  office  for  an  8Mbps  download  data  rate  would  be 
approximately  1.7  miles,  whereas  1.5  Mbps  has  a  3.4  mile  limit. 
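
The relationship between loop length and achievable download rate can be expressed as a simple lookup. The sketch below uses only the two distance limits quoted above and treats the rate as a step function, which is a simplification: actual rates vary with line quality and are not limited to these two values. The function and table names are illustrative.

```python
# (maximum distance from the central office in miles, download rate in Mbps),
# taken from the two example figures quoted in the text.
ADSL_TIERS = [(1.7, 8.0), (3.4, 1.5)]


def max_download_mbps(distance_miles):
    """Return the highest quoted download rate available at a given distance.

    Returns None when the distance exceeds the longest quoted limit.
    """
    for limit_miles, rate_mbps in ADSL_TIERS:
        if distance_miles <= limit_miles:
            return rate_mbps
    return None
```

For example, a site 2.5 miles from the central office falls outside the 8Mbps limit but inside the 1.5Mbps limit.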

Computer  industry  leaders  such  as  Compaq,  Intel,  Microsoft  and  phone  companies 
such  as  Ameritech,  Bell  Atlantic,  SBC  Communications,  US  West,  Sprint  and  GTE  have 
joined  in  an  alliance  to  promote  ADSL.  ADSL  technology  has  the  potential  to  further 
enhance  desktop  videoconferencing  by  removing  the  bottleneck  that  currently  plagues  many 
users  connected  via  standard  POTS.  Furthermore,  what  makes  ADSL  truly  attractive  is  that 
the  infrastructure  required  to  support  it,  twisted-pair  copper  phone  lines,  is  already  in  place. 
The  current  problems  with  ADSL  are  its  lack  of  availability  and  high  equipment  costs. 

3.  Cable  Modems 

Cable  Internet  access  is  a  relatively  new  transport  technology  that  is  still  in  its  early 
stage of rollout. Until the past year, phone companies had been slow to implement
ADSL in their central offices, which was a favorable situation for the growth and
accessibility  of  cable  Internet  access.  At  the  end-point,  a  cable  modem  connects  to  the 
cable  television  coaxial  wiring  and  also  attaches  to  the  end-user’s  desktop  via  a  standard 
Ethernet connection. Cable modems can theoretically deliver data at up to 350 times the rate of
a 28.8Kbps modem (i.e., 10Mbps). Unlike point-to-point ADSL, cable modems use a shared
medium, making their architecture a good fit for multicasting. Additionally, end users will not
have  to  build  from  scratch  to  take  advantage  of  multicasting.  However,  because  cable 
modems  are  shared,  they  are  bound  to  run  into  congestion  problems  on  the  wire  as  users  fill 
up  local  cable  loops. 
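
The arithmetic behind the shared-medium concern can be worked through in a short Python sketch. It assumes, purely for illustration, that the 10Mbps of a cable loop is divided evenly among active users and that protocol overhead is ignored; the 128Kbps stream rate is the figure used in Chapter IV.

```python
CABLE_BPS = 10_000_000   # theoretical cable modem rate from the text
MODEM_BPS = 28_800       # 28.8Kbps POTS modem


def speedup():
    """Ratio of the cable modem rate to a 28.8Kbps modem."""
    return CABLE_BPS / MODEM_BPS


def per_user_bps(active_users):
    """Even share of the loop per active user (a simplifying assumption)."""
    if active_users < 1:
        raise ValueError("at least one active user is required")
    return CABLE_BPS / active_users


def max_unicast_streams(stream_bps):
    """How many separate streams of a given rate fit on one loop."""
    return CABLE_BPS // stream_bps
```

Under these assumptions the ratio is about 347, consistent with the "up to 350 times" figure, and a loop with 50 active users leaves each user about 200Kbps. With multicast, a single copy of a stream can serve every subscriber on the loop.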

Due  to  technical  limitations,  many  cable  Internet  services  do  not  allow  users  to  send 
data  via  the  cable  link.  Hybrid  systems,  in  which  incoming  data  comes  via  the  cable 
connection,  but  the  outgoing  data  travels  over  the  POTS  modem  connection  are  the  most 
common.  Therefore,  this  current  system  works  well  if  the  end-user  desires  to  receive 
videoconferencing  data,  but  it  is  not  a  good  set-up  for  delivering  videoconferencing  content 
from  the  desktop. 


D.  TERRESTRIAL  TRANSMISSION 

1.  Routing 

a.  Tunneling 

When  deciding  what  routing  protocol  is  most  effective  over  a  network,  one 
must  look  at  the  network  design  and  topologies.  While  the  NIPRnet  (as  a  whole)  is  not 
multicast  enabled,  the  Cisco  System  routers  used  throughout  the  NIPRnet  can  be  easily 
configured  to  support  multicast.  An  alternative  method  is  to  form  “tunnels”  between 
selected multicast-enabled routers in the CONUS segment. Subnet islands can be created,
similar to what is used in the MBone, to connect various end-users. These tunnels can be
extended to gateways that have multicast-enabled routers with access to satellite terminals
in order to provide a connection to remote (deployed) users. Some major ISPs (such as
UUNET) are using tunneling to implement IP multicast across their networks. A tunnel is
essentially a unicast virtual link, possibly crossing several bridges and routers, that
carries encapsulated multicast packets. Tunnel endpoints can be either routers supporting native
multicast  routing  or  workstations  running  the  mrouted  multicast  daemon. 

The advantage of tunneling is that it is quick and easy to implement and
may be the best solution when both the number of customers using IP multicast and the
quantity of IP multicast traffic are limited. Additionally, tunneling is a cost-effective way to
gain  the  benefits  of  multicast  without  adding  excessive  risks  or  making  mass  hardware 
changes.  However,  there  are  two  major  disadvantages.  The  first  disadvantage  is  setting  up 
and  managing  multicast  servers  or  gateways.  The  second  is  that  tunneling  inserts  the 
process  of  encapsulating  IP  Multicast  datagrams  into  unicast  IP  datagrams,  essentially 
slowing  down  the  transmission  and  introducing  scaling  problems  [Hurwicz,  97]. 

b. PIM-SM

As  mentioned  in  Chapter  IV,  Sparse-Mode  protocols  are  based  upon  the 
assumption  that  the  multicast  group  members  are  sparsely  distributed  throughout  the 
network and bandwidth is not necessarily widely available. PIM-SM addresses the need for a
scalable wide-area, inter-domain, multicast routing mechanism in a large network
infrastructure, such as NIPRnet. PIM-SM is available in Cisco Systems’ routers (which
comprise most of the routers used on the NIPRnet). PIM-SM solves the routing table
problem, found in DVMRP, by using the unicast tables for multicasting [Hurwicz, 97], but
there are still some drawbacks. Because unicast routes adjust automatically to equipment or
link failures, if there are specific routes that multicast traffic should or must take, there is no
guarantee that it will take those routes. If not all routers are multicast enabled (which is
highly likely), data may be lost.
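
One common way a multicast router reuses its unicast table is the reverse-path forwarding (RPF) check: a multicast packet is accepted only if it arrives on the interface the router would itself use to send unicast traffic back toward the source. The Python sketch below illustrates that check under stated assumptions; the routing table contents, interface names, and function names are hypothetical, and the sketch is not a description of any particular router's implementation.

```python
from ipaddress import ip_address, ip_network

# Hypothetical unicast routing table: destination prefix -> outgoing interface.
UNICAST_ROUTES = {
    ip_network("10.1.0.0/16"): "serial0",
    ip_network("10.1.5.0/24"): "ethernet1",
    ip_network("0.0.0.0/0"): "serial1",   # default route
}


def unicast_interface(address):
    """Longest-prefix match of an address against the unicast table."""
    addr = ip_address(address)
    matches = [net for net in UNICAST_ROUTES if addr in net]
    best = max(matches, key=lambda net: net.prefixlen)
    return UNICAST_ROUTES[best]


def passes_rpf_check(source, arrival_interface):
    """Accept a multicast packet only if it arrived on the interface
    that the unicast table uses to reach its source."""
    return unicast_interface(source) == arrival_interface
```

Because the check depends entirely on the unicast table, a change in unicast routing after a link failure also changes which interface multicast traffic is accepted on, which is the loss of route control noted above.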

NASA  addressed  this  problem  on  its  NASA  Research  and  Education 
Network  (NREN)  by  moving  the  responsibility  for  the  multicast  network  to  the  same  groups 
that  were  managing  the  unicast  network.  Since  the  hardware  usually  has  a  decisive 
influence  on  the  choice  of  multicast  routing  protocol,  NASA  uses  PIM  in  the  Cisco-based 
portions of the network, and MOSPF on the Proteon router portion, since those routers are oriented
towards  MOSPF  [Hurwicz,  97]. 

Since  distance  learning  via  videoconferencing  in  the  Navy  will  require 
data  to  be  transmitted  worldwide,  PIM-SM  should  be  seriously  considered  as  a  routing 
protocol  in  NIPRnet  routers  used  for  multicasting. 

2.  IP  over  ATM 

The  NIPRnet  has  a  10-node  ATM  backbone  in  the  Continental  United  States  that  is 
connected via SONET OC-12 (622Mbps) pipes. The ATM switches provide switched
(SVC) or permanent virtual circuits (PVC), and have promised to handle the QoS issues that
IP  multicast  traditionally  did  not  address.  Therefore,  instead  of  the  IP  datagrams  being 
routed  across  the  long-haul  pipes,  they  will  jump  to  the  ATM  backbone  and  exit  at  a 
NIPRnet  router  closest  to  the  destination. 
Although it has been proven that ATM has the ability to scale under high traffic
loads, one major problem with transporting IP over ATM is that the IP datagrams have to be
mapped to ATM protocols before they go over the ATM backbone, and then converted back.

Not converting IP datagrams to ATM cells eliminates three potential problems. First,
IP-to-ATM  protocols  such  as  MPOA  are  complicated,  and  ATM  is  still  unfamiliar  to  many 
network  managers.  Second,  standards  for  the  protocols  to  map  IP  to  ATM  are  still  not 
officially  set,  although  they  are  close  to  being  finalized.  Finally,  if  the  challenge  is  how  to 
push  more  IP  traffic  across  the  data-oriented  Internet,  you  can  ignore  all  of  the  other  things 
ATM  is  supposed  to  do  (such  as  voice)  and  use  ATM’s  fast  hardware  for  switching  IP 
traffic  [Dutcher,  97].  Therefore,  finding  economical  ways  of  trafficking  IP  datagrams 
across  ATM  network  backbones  can  be  a  plus  for  IP  based  videoconferencing  applications. 

IP over an ATM network combines layer 3 scalability and flexibility with layer 2
switching and high performance, essentially amounting to VCs across a TCP/IP network,
that  can  stream  data  at  high  speeds.  Through  the  development  of  layer  3  routing  in 
switches,  two  popular  methods  have  emerged,  IP  switching  and  Tag  switching. 

a. IP Switching

Developed  by  Ipsilon  Networks,  IP  switching  software  creates  IP  ability  in 
ATM  switches.  The  idea  is  to  establish  a  path  across  a  network.  If  a  network  of  IP  switches 
set  up  a  “switched”  virtual  circuit  (VC)  among  themselves  across  a  network,  they  can 
improve traditional IP routing. The ATM switch acts as a router for short-duration traffic and
as an ATM switch for long-duration flows. It is designed to allow network administrators
to determine how long a flow should be in order to activate switching instead of IP routing.
NASA is currently conducting studies on the use of IP switching. Its
simulation studies have shown eighty-four percent of data packets can be IP routed
[Breeden, 97].
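
The administrator-set flow threshold described for IP switching can be sketched as a small classifier. This is an illustration only: measuring flow length as a packet count, the class name, and the flow identifiers are all assumptions, and the sketch does not represent Ipsilon's actual software.

```python
from collections import defaultdict


class FlowClassifier:
    """Route the first packets of a flow; switch the flow once it is long enough."""

    def __init__(self, packet_threshold):
        # Administrator-chosen flow length (in packets) that triggers switching.
        self.packet_threshold = packet_threshold
        self.packet_counts = defaultdict(int)

    def handle(self, flow_id):
        """Return 'routed' or 'switched' for the next packet of a flow."""
        self.packet_counts[flow_id] += 1
        if self.packet_counts[flow_id] > self.packet_threshold:
            return "switched"
        return "routed"


classifier = FlowClassifier(packet_threshold=3)
video_flow = ("10.1.1.5", "224.2.0.1")   # hypothetical (source, group) pair
decisions = [classifier.handle(video_flow) for _ in range(5)]
```

A long-lived video stream quickly exceeds the threshold and is switched, while a short exchange of a few packets is handled entirely by conventional routing.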


b.  Tag  Switching 

Tag  Switching  software  is  developed  by  Cisco  Systems.  Working  with  ATM 
networks,  the  software  tags,  or  maps,  the  current  network  and  stores  the  data  in  routers.  The 
data  packets  are  tagged  and  switched  as  they  leave  their  starting  points  (in  this  case 
Bandwidth  Management  Centers).  The  tags  can  use  the  Last-in-First-Out  (LIFO)  method  at 
the  switch  based  upon  its  priority  designation.  The  tags  allow  the  network  to  plot  a  course 
through  the  ATM  backbone  portion.  The  ATM  switches  scan  the  tag  and  then  send  it  to  the 
next  switch.  A  tag  can  be  an  aggregate  of  tags,  allowing  an  iterative  process  that  increases 
the  scalability  of  the  network.  Unlike  routers,  the  switches  will  need  to  know  the  complete 
path  to  the  edge  router  destination. 
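
The way switches forward on a tag rather than on the destination address can be sketched with per-switch lookup tables. The tables, tag values, and node names below are hypothetical, and the sketch shows only generic per-hop tag swapping; it does not model tag aggregation or any detail of Cisco's software.

```python
# Hypothetical per-switch tag tables: incoming tag -> (outgoing tag, next hop).
# An outgoing tag of None means the packet leaves the tagged backbone.
TAG_TABLES = {
    "switch_a": {17: (22, "switch_b")},
    "switch_b": {22: (35, "switch_c")},
    "switch_c": {35: (None, "edge_router")},
}


def forward(first_switch, tag):
    """Follow tag lookups hop by hop and return the list of nodes visited."""
    path = [first_switch]
    node = first_switch
    while tag is not None:
        tag, node = TAG_TABLES[node][tag]  # a single table lookup per hop
        path.append(node)
    return path


backbone_path = forward("switch_a", 17)
```

Each hop performs one table lookup on the tag instead of a full routing decision, which is the source of the performance benefit claimed for this approach.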

One  drawback  is  that  tag  switching  only  works  with  Cisco  equipment.  Since 
the  vast  majority  of  routers  used  in  NIPRnet  are  Cisco  routers,  there  will  be  no  need  for 
major  hardware  procurement  to  utilize  this  method. 
c. ATM Considerations

If  either  of  these  two  aforementioned  methods  is  used  over  NIPRnet’s  ATM 
backbone,  native  routing  will  essentially  be  pushed  to  the  periphery  of  the  network,  allowing 
IP  switching  or  Tag  switching  to  handle  the  backbone  segment.  Each  method  advertises  the 
ability  to  provide  almost  the  same  bandwidth  as  ATM  without  having  to  add  an  extra  layer 
of  conversion  to  already  time  critical  data.  Also,  IP  over  ATM  may  not  only  provide 
significant  savings  in  architecture  changes,  but  might  also  alleviate  the  need  for  customers 
being  forced  to  implement  ATM  to  the  desktop,  requiring  even  more  spending.  One 
potential  problem  with  these  two  methods  of  IP  over  ATM  is  that  they  are  still  under 
development  and  have  not  proven  their  ability  to  scale  under  heavy  network  loads. 
Furthermore,  most  videoconferencing  applications  are  already  devoted  mostly  to  IP. 


E. VIDEOCONFERENCING OVER DISN’s SATELLITE SYSTEMS

A  deployed  unit’s  means  of  transporting  videoconferencing  over  DISN  (i.e. 
NIPRnet) will be by using military and commercial SATCOM (C-band and Ku-band), Ultra
High  Frequency  (UHF)  and  Super  High  Frequency  (SHF)  SATCOM,  MILSTAR  Extremely 
High  Frequency  (EHF)  Medium  Data  Rate  (MDR),  DSCS  (military),  and/or  C  band  SHF 
terminals  (Challenge  Athena)  into  an  entry  point  or  DISN  gateway.  To  provide  a  gateway 
to  the  terrestrial  segments  of  DISN,  this  integrated  satellite  transmission  system  will  be 
further  interconnected  with  the  services  of  the  Standardized  Tactical  Entry  Point  (STEP). 
1. Space Segment


The space segment is composed of Ultra High Frequency (UHF) SATCOM; DSCS
II/III multi-channel SHF at 75bps - 1.5Mbps (T-1); MILSTAR Extremely High Frequency
(EHF) Medium Data Rate (MDR) at 4.8Kbps - 1.544Mbps;
commercial SATCOM (L, C, and Ku bands) at 2.4Kbps - 8.448Mbps; and the Global
Broadcast System (GBS), which is currently being readied.

Since satellite links are inherently broadcast in nature, an implementation of a typical
satellite  link  requiring  satellite  terminals  and  military  or  commercial  satellite  resources  fits 
well  within  the  IP  multicast  basic  model. 
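
The fit between a broadcast satellite link and IP multicast can be shown with a small calculation. The sketch compares the link load of unicast and multicast delivery for the 128Kbps stream used as an example in Chapter IV over a 1.544Mbps channel; it ignores protocol overhead and assumes all receivers share one satellite footprint, both simplifications made for illustration.

```python
STREAM_KBPS = 128    # example video stream rate from Chapter IV
LINK_KBPS = 1544     # 1.544Mbps satellite channel


def unicast_load_kbps(receivers):
    """Unicast sends one copy of the stream per receiver."""
    return STREAM_KBPS * receivers


def multicast_load_kbps(receivers):
    """Multicast sends one copy regardless of the number of receivers."""
    return STREAM_KBPS if receivers > 0 else 0


def max_unicast_receivers():
    """Largest number of unicast copies that fit on the link."""
    return LINK_KBPS // STREAM_KBPS
```

With 20 ships receiving the same lecture, unicast delivery would need 2560Kbps, more than the channel provides, while multicast delivery needs 128Kbps.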

Deployed  units'  entry  point  accesses  are  currently  supported  primarily  at  Navy 
SATCOM  facilities,  which  serve  three  of  the  four  NCTAMS.  Navy  access  to  non- 
NCTAMS  sites  requires  circuits  to  be  terrestrially  back-hauled  to  the  nearest  NCTAMS  site. 
Navy  access  procedures  to  terminal  segments  are  described  in  Naval  Telecommunications 
Publication  (NTP)-4,  NTP-2,  and  Communications  Information  Bulletins  (CIBs). 


2.  Terminal  Segment 

Connectivity with shore communities can be leveraged using the Standardized
Tactical Entry Points (STEP). STEP is a Joint Staff directed upgrade to the DSCS
portion  of  the  Digital  Communications  Satellite  Subsystem  (DCSS)  program,  which  is 
designed  to  improve  and  standardize  Navy  Tactical  Satellite  Communications  (SATCOM). 
Fourteen  DSCS  sites  will  eventually  be  upgraded  worldwide  to  provide  access  to  DISN. 
STEP  sites  provide  both  ship-to-shore  and  ship-to-ship  communications  consisting  of 
operational and administrative traffic. These sites can be either single or dual: a
single STEP site supports one satellite coverage area, while a dual STEP site supports at least
two satellite areas. These gateways can allow at-sea units to quickly connect to the DISN
sustaining base services that they need for videoconferencing data. Under the ITSDN
Program, NIPRNET routers are installed at the STEP sites, with a 512Kbps transmission
path provided from the STEP site ITSDN router to the NIPRNET backbone. One drawback
is  that  tactical  access  to  ITSDN  is  provided  only  on  a  temporary  basis  and  may  require 
CINC  approval.  The  ITSDN  IP  router  address  assignments  for  tactical  units  are  obtained 
and  provided  by  the  user. 


3.  Network  Cache  at  the  Gateways 

Because  the  ship/shore  gateway  is  a  component  of  the  paths  of  many 
videoconferencing  sessions  travelling  across  the  NIPRnet,  storing  sessions  on  a  cache  server 
offers  a  potentially  significant  savings  in  bandwidth  and  end  user  latency  by  allowing  end- 
users  to  retrieve  data  at  the  gateway,  rather  than  having  to  reach-back  to  the  original  source. 

Network caching can be used to deliver video/audio to sea from large disk caches at
various gateways, while saving needed bandwidth across the NIPRnet’s terrestrial
backbone  network.  Therefore,  if  a  student  were  not  able  to  receive  a  videoconferencing 
session  real-time,  he  or  she  might  download  a  session  from  a  network  cache  server,  where 
the  stored  (recorded)  session  would  be.  Since  personnel  will  be  enrolled  in  a  variety  of 
courses,  it  must  be  assumed  that  all  units  will  not  be  downloading  the  same  information. 
Therefore  this  type  of  flexibility  would  require  a  very  large  disk  cache  to  store  information. 
In  order  manage  the  resources,  a  certain  amount  of  digital  storage  space  would  need  to  be 
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allocated  for  each  course  on  the  cache  server,  and  also  it  must  be  decided  how  long  to  leave 
a  videoconferencing  session  “forward  stored”  on  the  server.  For  example,  if  a  typical  video 
and  voice  data  stream,  transmitted  to  the  network  cache  at  300Kbps  (near  the  upper 
transmission  end  of  VIXS),  were  fifty  minutes  long,  the  storage  space  required  for  the 
lecture  would  be  approximately  1 1  IMbytes.  Table  5-1  shows  the  estimated  storage  space. 

300Kbps stream * 1 Byte/8 bits = 37.5KBps
37.5KBps * 3600 seconds/hour = 135MB/hour
135MB/hour * .825 hours = 111.375MB required per lecture
Table 5-1: Estimated Digital Storage Requirements

If each course stored one week of lectures on a 5GB disk drive, leaving storage
space for system operation, over 40 lectures could be stored on just that one drive. With
digital storage expected to cost about .02 cents per MB by 1998^, the cost of storage is
minimal.
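
The arithmetic of Table 5-1 can be reproduced with a short sketch. The function names and the 250MB reserve for system operation are illustrative assumptions, not figures from the study, and decimal units (1MB = 1000KB) are assumed, as in the table. Note that an exact 50 minutes (0.8333 hours) gives 112.5MB, slightly more than the 111.375MB obtained with the table's 0.825-hour factor.

```python
def lecture_storage_mb(stream_kbps: float, minutes: float) -> float:
    """Storage in MB for a constant-rate stream of the given length."""
    kbytes_per_second = stream_kbps / 8.0             # 300 Kbps -> 37.5 KBps
    mb_per_hour = kbytes_per_second * 3600 / 1000.0   # 37.5 KBps -> 135 MB/hour
    return mb_per_hour * (minutes / 60.0)


def lectures_per_drive(drive_gb: float, lecture_mb: float,
                       reserved_mb: float = 0.0) -> int:
    """Whole lectures that fit on a drive after reserving system space."""
    usable_mb = drive_gb * 1000.0 - reserved_mb
    return int(usable_mb // lecture_mb)


if __name__ == "__main__":
    per_lecture = lecture_storage_mb(300, 50)
    print(f"{per_lecture:.1f} MB per 50-minute lecture at 300 Kbps")
    # reserved_mb is a hypothetical allowance for system operation.
    print(lectures_per_drive(5, per_lecture, reserved_mb=250),
          "lectures fit on a 5GB drive")
```

Under these assumptions the result stays above 40 lectures per drive, which agrees with the estimate in the text.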

Network cache systems could be used with the Global Broadcast System (GBS) to
broadcast videoconferencing data to users. The GBS space segment is a Ka-Band
communications payload carried aboard U.S. Navy UHF Follow-On (UFO) satellites. By
providing reliable multicast transport protocols with GBS, users can download
videoconferencing sessions from a gateway and store the data locally for future use. User
requests can be made over a slower back channel. In order to manage bandwidth over the
space segment, each unit can be given a download-time window, or the network cache can
be controlled to deliver content to sea only during non-peak hours.


^ Survey taken by consulting firm Disk/Trend Inc. of Mountain View, CA.


F.  SHIPBOARD


1.  ADNS 

Because many shipboard networks are not interoperable and require some type of
gateway to interface with other systems, SPAWARSYSCOM has developed the Automated
Digital Network System (ADNS) within the Joint Maritime Communications Strategy
(JMCOMS). ADNS is attempting to convert the Navy's stovepipe systems into
network-compatible systems without incurring the cost of completely redesigning and
procuring new systems for delivering data to afloat forces [Bergdahl, 96].

Currently  the  bandwidth  of  ADNS  cannot  support  real-time  videoconferencing,  but 
as  it  improves  bandwidth  capacity,  ADNS’s  routing  and  switching  system  will  provide  the 
interface  to  end-user  video  and  voice  data  across  available  RF  media.  The  routing  and 
switching  subsystem  should  include  an  IP  router  and  a  suite  of  common  multicast  routing 
protocols. The routers should also support QoS protocols, such as RSVP. In order to
prevent multicast packets from wasting bandwidth on the shipboard LAN,
multicast filtering switches might be used. IP multicast-enabled switches automatically set
up filters so that multicast traffic is directed only to participating end-nodes.


G.  CONCLUSION 

As shown in this chapter, the network infrastructure and technology are available to
deliver IP multicast to sea. If used, delivering videoconferencing over DISN's IP-based
networks can alleviate the need for dedicated systems that still require people to travel.
Because the architecture and management systems are already in place, using IP-based
networks can provide distance learning to a broad audience with minimal spending.


VI.  VIDEOCONFERENCING APPLICATIONS


A.  INTRODUCTION 

This chapter discusses typical videoconferencing software and hardware that can be
used to deliver distance learning via videoconferencing from a desktop computer over an
IP-based network. It does not endorse any particular software application; it
merely provides some examples of common tools currently available. This chapter also
provides the recommended standards for employing desktop videoconferencing.


B.  VIDEOCONFERENCING  APPLICATIONS 

Although most of the newer routers and switches support IP multicast, on many of
them it is not enabled by default. Also, many current software applications are unicast
and must be modified to interface with the multicasting capabilities of TCP/IP stacks,
which in turn join and leave multicast groups by using IGMP [Hurwicz, 97]. Because
companies realize that there is great potential in videoconferencing, these issues have
not inhibited application developers from eagerly creating new products.^

Bandwidth and picture quality are still major impediments, but other barriers, such as
standardization, cost, and installation cost, continue to diminish. Microsoft has embedded
its collaboration tool, NetMeeting, in its free Internet Explorer 4.0 browser. Netscape
Communicator 4.0, which is also free, packages an analogous tool, Netscape Conference.
Microsoft has also released a UNIX version of Internet Explorer 4.0. The MBone also
uses free videoconferencing desktop applications (vic, vat, sdr, wb) that are proven and
reliable. Unfortunately, many of the commercial desktop applications, which are PC based,
are not fully compatible with the MBone tools, which are mostly UNIX based.

^ According to Multimedia Research Group, Inc. of Sunnyvale, CA, and Fuji Keizai USA, approximately
4,000 Web sites offered video clips in 1996. That number tripled to 12,000 in 1997 and is expected to triple
each year for at least the next three years.

Delivering  synchronous/asynchronous  video  and  audio  streams  to  sea  not  only 
requires  a  network  architecture,  but  it  also  requires  software  tools  that  are  capable  of 
providing  quality  content  to  the  student.  Even  so,  quality  content  delivery  does  not  replace 
the  need  for  occasional  student/instructor  collaboration.  Today’s  desktop  videoconferencing 
tools can generally be broken down into two categories. First are standards-based
collaboration applications, which provide complete information-sharing solutions that span
the spectrum from one-to-one to fully interactive meetings. Second are streaming
applications, which broadly distribute one-way, live or stored presentations. Desktop
collaboration applications enable users to communicate with a small number of others, as
in desktop videoconferencing. Streaming applications are much more scalable, making
it possible to reach a virtually unlimited audience.

Streaming applications will generally have both client and server software, whereas
collaboration applications can be client-to-client. To initialize multipoint sessions,
users of collaborative applications register their contact information with a location server.
Four11 and Microsoft's Internet Location Server (ILS) are two examples. These servers are
based upon the Lightweight Directory Access Protocol (LDAP).

Because audio is the most critical and sensitive aspect of videoconferencing,
applications should provide features that allow audio adjustments to compensate for
non-guaranteed bandwidth. Applications must support several audio codecs so that the
portion of the data stream allocated to audio can be matched to the available bandwidth.
Chat room software can be used as an option when voice and video are bandwidth
constrained. The ability to tune audio during transmission, together with embedded
Forward Error Correction (FEC) or redundancy schemes such as those used in CU-SeeMe
and the MBone's rat tool, can help minimize poor audio reception.
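
As an illustration of matching the audio codec to the available bandwidth, the following sketch picks the highest-rate codec that fits within an audio budget. The selection policy, the function name, and the 50 percent audio share are hypothetical; the rates are the nominal ITU-T bit rates (G.723.1 at its 6.3Kbps mode) and ignore packet-header overhead. G.722 is omitted because its top rate ties with G.711.

```python
from typing import Optional

# Nominal codec bit rates in Kbps, highest first (header overhead ignored).
CODECS = [("G.711", 64.0), ("G.728", 16.0), ("G.723.1", 6.3)]


def pick_audio_codec(available_kbps: float,
                     audio_share: float = 0.5) -> Optional[str]:
    """Return the highest-rate codec that fits the audio budget.

    audio_share is an assumed fraction of the link reserved for audio.
    None means no codec fits, so the session falls back to text chat.
    """
    budget = available_kbps * audio_share
    for name, rate_kbps in CODECS:
        if rate_kbps <= budget:
            return name
    return None


if __name__ == "__main__":
    for link_kbps in (300, 64, 28.8, 10):
        print(link_kbps, "Kbps link ->", pick_audio_codec(link_kbps))
```

A real application would also account for RTP/UDP/IP header overhead and would renegotiate the codec as conditions change.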

Desktop videoconferencing collaboration applications also need a combination of
document management capabilities, such as file sharing, whiteboard, and snapshot tools,
which allow users to capture whole windows or parts of windows for cutting and pasting to
the whiteboard. Standard e-mail applications can be used for administrative purposes, such
as setting up a time for point-to-point conferencing when additional help is required.

Multicasting videoconferencing applications use what is basically a straightforward
extension to the BSD 4.3 Berkeley Socket API, which is supported by operating systems such as
UNIX, and Windows 95 and NT. As these APIs become cross-platform capable, and more
readily supported by Winsock 2, they will be ready for widespread use on PCs running
operating systems such as Windows.
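
The multicast extension to the socket API amounts to a few socket options. The following is a minimal sketch of a receiver joining and leaving a group, using Python's standard socket module rather than any particular videoconferencing tool; the helper names are illustrative assumptions. Joining a group causes the host's TCP/IP stack to send the IGMP membership report on the application's behalf.

```python
import ipaddress
import socket
import struct


def build_mreq(group: str, interface: str = "0.0.0.0") -> bytes:
    """Pack the ip_mreq structure: group address plus local interface."""
    if not ipaddress.ip_address(group).is_multicast:
        raise ValueError(f"{group} is not a multicast (Class D) address")
    return struct.pack("4s4s", socket.inet_aton(group),
                       socket.inet_aton(interface))


def open_multicast_receiver(group: str, port: int) -> socket.socket:
    """Create a UDP socket bound to the port and joined to the group."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    # The join triggers an IGMP membership report from the host stack.
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    build_mreq(group))
    return sock


def leave_group(sock: socket.socket, group: str) -> None:
    """Drop membership so the stack stops receiving the group's traffic."""
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_DROP_MEMBERSHIP,
                    build_mreq(group))
```

A receiver might call open_multicast_receiver("224.2.0.1", 5004) and then read datagrams with recvfrom; the group address and port here are hypothetical examples, not values from the study.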


C.  RECOMMENDED  STANDARDS 

H.323, H.324, and T.120, along with multicast protocols such as IGMP, RTP, RTCP,
and RSVP, make up the primary standards for desktop videoconferencing systems. As an
extension of H.320, H.323 addresses multipoint videoconferencing over ISDN and POTS, as
well as LANs and the Internet. H.324 is the standard for real-time multimedia
over POTS. When using applications from different vendors, ensure that each completely
implements the standards it claims. For example, Microsoft NetMeeting and Netscape
Conference are both "H.323 compliant," but they do not have any common audio codecs,
rendering them unable to talk with each other. Even with such mismatches, standards
support and the deployment of the application base required for most desktop
videoconferencing are no longer inhibitors. As in the past, network bandwidth and
interoperability across different platforms are still the major problems. Dial-up with
modems over POTS continues to be a choke point for delivering and receiving
videoconferencing. As H.324 matures, manufacturers will begin to build more H.324
compliant chip sets into hardware. As of now, H.324 is acceptable for point-to-point
collaboration, but not for supporting IP multicast.

Although the ITU-T has provided the baseline codec standards for
videoconferencing, several de facto standards have emerged. Microsoft Video
for Windows and Apple QuickTime are common video codecs. QuickTime is compatible
with both Windows and Macintosh environments, and its file format has been accepted by
ISO as the basis for the MPEG-4 file format. The use of hardware codecs can relieve some
of the CPU load, but today's multimedia-capable processors can generally handle the work
in software.

One of the first companies to market a product fully based upon IETF standards for
real-time video and audio streams, and ITU-T standards for data compression and
decompression, was Precept Software. Its Flashware Server software and IP/TV viewer
client were initially available for Wintel based systems. Because it implements
nonproprietary standards, this product can receive MBone group sessions, giving it the
capability to interoperate with UNIX platforms. Until more companies adopt universal
standards, this is one of the few options for cross-platform capability between UNIX and PC
users.

Table 6-1 describes the minimum standards needed for videoconferencing systems.


Audio:
    (1) G.711, G.722, G.728, Full-Duplex
    (2) G.723, Full-Duplex

Whiteboard:
    (1) T.120, JPEG, GIF, TIFF, Postscript, Still Frame Capture, File Transfer
    (2) T.120, JPEG, GIF, TIFF, Postscript, Still Frame Capture, File Transfer

Additional Features:
    (1) Chat Functions, Application Sharing
    (2) Chat Functions, Application Sharing

Multicasting:
    RTP, RTCP, (RSVP, RTSP when adopted), Multiple Simultaneous Sessions, MOSPF, PIM

Controls:
    (1) BW Controls (Frame-Rate, Image Size)
    (2) BW Controls

Additional Support:
    (1) Trial Copies for testing
    (2) Firewall Configuration, Trial Copies for testing

Table 6-1: Videoconferencing Standards over IP Networks


D.  HARDWARE


Today's desktop computers provide most of the hardware components needed for
videoconferencing. A good camera and video capture card, which can cost as little as $200,
are normally all of the upgrading that is required. This is a markedly low price in comparison
to roll-about and room-based systems. The release of newer, faster multimedia-based
processors is sealing the fate of expensive hardware codecs. The recommended
desktop system hardware to support desktop videoconferencing is:

•  Desktop  w/  processor  that  supports  multimedia 

•  Digital  camera  for  face  view^ 

•  Microphone 

•  Speakers  and/or  headphones 

•  16-bit sound card (full-duplex)

•  Video  Card 

•  Video  capture  card^ 

•  Web  Server^ 

•  Minimum  28.8Kbps  Modem 


^ Cameras that connect to video capture boards are recommended. Parallel port cameras require
excessive CPU cycle time (for less powerful CPUs, below a Pentium 133, use an analog camera).

^ Video capture cards may include onboard codecs, but as processor power has increased, these more
expensive boards have become unnecessary. This recommendation is based upon a face-to-face conference.
If a server is the capturing device, it will use a video capture board.

^ For streaming video applications over the Internet/intranets.

Cameras that connect to video capture boards are recommended. Parallel port cameras
are unacceptable because of inadequate data throughput, and because they require excessive
CPU cycle time.


E.  SUMMARY 

The ITU-T and IETF standards will likely gain broad acceptance since they are
based upon videoconferencing over commonly existing network architectures. In order
for videoconferencing to gain full acceptance, H.320, H.323 and H.324 must work together
in integrated applications.

Although desktop videoconferencing is becoming more capable, the frame rates and
picture size of streaming videoconferencing applications are still lacking. If used
in conjunction with collaborative software such as whiteboards, shared applications and
shared control, however, there is adequate functionality to conduct meaningful learning.


VII.  VIDEOCONFERENCING DEMONSTRATION


A.  INTRODUCTION 

This chapter provides a proof of concept that demonstrates how current
videoconferencing software can be used to deliver synchronous or asynchronous material
for distance learning over an IP based network via multicast. The demonstration follows
on from work accomplished in Internetworking: Economical Storage and Retrieval of Digital
Audio and Video for Distance Learning [Tiddy, 95] and Internetworking: Worldwide
Multicasting of the Hamming Lectures for Distance Learning [Emswiler, 95].


B.  OVERVIEW 

Several  free  software  tools  were  considered,  and  the  one  selected  was  the  MBone 
VCR  on  Demand  (MVoD),  developed  by  Wieland  Holfelder  at  the  University  of  Mannheim, 
Germany.  The  MVoD  is  a  free,  experimental  software  solution  for  the  interactive  remote 
recording  and  playback  of  multicast  videoconferences.  The  MVoD  Service  offers  a 
graphical  user  interface  (GUI)  environment  where  the  user  can  interactively  record 
audio/video  conferences  on  a  remote  server,  controlling  the  recording  session  with  a  local 
client  application.  Later,  that  same  user  or  other  users  can  play  the  session  back  on  demand, 
via  multicast  or  unicast. 

Through the use of this tool, the goals of this experiment were to demonstrate:

•  A successful download and installation of the MVoD Service software.

•  Multicasting a prerecorded taped lecture over the MBone via an SGI workstation
while recording the multicast lecture using the MVoD Service from a second
workstation that has the MVoD Service software installed.

•  Using the MVoD Service to play back and multicast a satisfactorily replicated
session over the MBone, which can be received by multiple users.


C.  DEMONSTRATION

To begin the testing, the MVoD Service software was downloaded from the site
http://www.informatik.uni-mannheim.de/informatik/pi4/projects/MVoD. Version 0.9a7 of
the software was installed on a Silicon Graphics Indy with a MIPS RIOOO processor and
128 MB of RAM, running the IRIX 6.2 OS. The MBone tools sdr, vic and vat were already
installed. The MVoD architecture consists of three components:

•  The  MVoD  Server:  handles  the  user  and  session  management 

•  The  MVoD  Client:  offers  the  users  a  GUI  to  access  the  MVoD  Service 

•  The  RTP  DataPump:  is  responsible  for  the  recording  and  playback,  the 
synchronization  and  the  administration  of  the  RTP  data  streams. 

A number of internal protocols have been developed to provide communication
between the various MVoD software components. They include the:

•  VCR Announcement Protocol (VCRAP): used by the server to announce its services to
all clients.

•  VCR Service Announcement Protocol (VCRSAP): used by the clients to access the
server.

•  VCR Stream Control Protocol (VCRSCP): used by the client to access and control a
session on the server.

•  RTP DataPump Control Protocol (RDCP): used by the server to control the RTP
DataPumps (one per session).

An interface has also been implemented with the Session Announcement Protocol
[Perkins, 97], which is used by the MBone tool sdr, in order for the MVoD server to learn
about ongoing MBone sessions. Figure 7-1 shows the MVoD architecture with its various
protocols, which are used in conjunction with the MBone tools. Detailed explanations of the
various protocols can be found at the MVoD web site.


Figure  7-1  MVoD  Architecture  [Holfelder,  97] 


The testing was accomplished using two SGI workstations (Indy and Octane models)
on the NPS LAN. The test lectures for the multicast transmission, which had been
developed from the thesis “Internetworking: Worldwide Multicasting of the Hamming Lectures
for Distance Learning” [Emswiler, 95], were input to an SGI Indy workstation (blacknoise)
from the line output of a VCR. The MVoD Service was running on an adjacent
workstation (electric). The MVoD Service and the MBone's sdr, vat and vic were the
software used for the experiment. The MBone tools are also free and can be downloaded
from many ftp sites. The MBone tools used have already been
proven effective; therefore the focus of this chapter will be on the effectiveness of the
MVoD recording and playback processes.


D.  RECORDING A BROADCAST

The first step was to set up the workstation expected to multicast the lecture over the
MBone by providing video and audio line connections from the VCR. The sdr tool on
blacknoise was used to create a new MBone session. The video and audio source of the
MBone transmission was provided by a VCR that played back the Hamming Lecture Series.
Once the VCR was connected and the session created, the lecture was multicast over the
MBone using vic and vat (RTPv2). For the multicast, default bandwidth settings for vic
H.261 (128Kbps) and vat PCM audio (64Kbps) were used. The time-to-live (ttl) was set to
15, in order to keep the transmission restricted to the campus LAN.
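
The TTL scoping described above corresponds to a single socket option on the sending side. In the demonstration the MBone tools set this value themselves; the sketch below only illustrates the underlying mechanism with Python's standard socket module, and the function name is an assumption.

```python
import socket


def open_multicast_sender(ttl: int = 15) -> socket.socket:
    """Create a UDP socket whose multicast datagrams carry the given TTL.

    Routers decrement the TTL at each hop and compare it against
    administratively configured thresholds, so a small value keeps the
    traffic close to its source.
    """
    if not 0 <= ttl <= 255:
        raise ValueError("ttl must be between 0 and 255")
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, ttl)
    return sock
```

Datagrams sent from this socket to a multicast group address with sendto would then be limited to the scope that the chosen TTL allows.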

On the workstation “electric,” the MVoD Service was running. The MVoD client
GUI was used to control the MVoD server and RTP data pump. The May 26, 1995 lecture
was recorded by MVoD using a 128Kbps (maximum) vic video stream, and the recording
lasted for 37 minutes. After the session was recorded, the file size of the recording was noted. Based
upon the five files that the MVoD server creates for each recorded session, the total file size
for the transmission was approximately 46MB. The recording therefore averaged 1.24MB
per minute of data stored. If a standard 50-minute lecture were held, the storage
requirements would be approximately 62MB. The expected file size is thus approximately
75MB per hour. (This size fits conveniently on a 100MB zip disk.)
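
The projections above follow directly from the measured file size. As a hedged cross-check, the sketch below recomputes them and compares the measured rate with the nominal ceiling implied by the configured stream rates (128Kbps video plus 64Kbps audio). The function names are illustrative, and decimal units (1MB = 1000KB) are assumed.

```python
def measured_mb_per_minute(total_mb: float, minutes: float) -> float:
    """Average storage rate observed for a recorded session."""
    return total_mb / minutes


def nominal_mb_per_minute(stream_kbps: float) -> float:
    """Upper bound implied by a constant stream at the configured rate."""
    return stream_kbps / 8.0 * 60.0 / 1000.0


if __name__ == "__main__":
    rate = measured_mb_per_minute(46, 37)
    print(f"measured: {rate:.2f} MB/min")
    print(f"50-minute lecture: {rate * 50:.0f} MB")
    print(f"one hour: {rate * 60:.0f} MB")
    print(f"nominal ceiling at 192 Kbps: {nominal_mb_per_minute(192):.2f} MB/min")
```

The measured rate falls below the nominal ceiling, which is consistent with 128Kbps being a maximum for the video stream rather than a constant rate.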
E.  PLAYING BACK AN MBone RECORDED SESSION


The next step in evaluating the MVoD Service was to play back the
recorded session while simultaneously multicasting it over the MBone. Using the MVoD
client GUI on electric, a list of sessions previously recorded by the server (only one, in our
case) was displayed. Once the session was selected, the GUI also provided
the option of playing back audio, video, or both. Both audio and video were
selected. When the play button was clicked, the session was multicast over the MBone, and
vic and vat were automatically launched locally so that the person playing back the
session could observe it. The transmission used the same bandwidth settings that were used
during the original session; these cannot be changed.

The rebroadcast (playback) of the session was observed using the vic and vat tools on
“electric” and “blacknoise.” From the observation, there was no discernible difference
between the recorded session and the original. There was no packet loss because
there was no congestion on the LAN containing the multicasting and receiving workstations.


F.  EVALUATION  OF  RESULTS 

Currently MVoD only runs on UNIX systems. For a user having little experience
with UNIX command lines and environment variables, the MVoD tool is not easy to install.
It is therefore recommended that only System Administrators or experienced UNIX users
install the software. During the initial installation, there were problems with killing
processes. For instance, some processes could not access sockets even after the prior process at
the socket had been killed. After becoming more proficient with the tool, and properly
shutting it down, this was no longer an issue. Further development of MVoD may make
installation simpler, and will likely provide a Windows version as well.

One important result is that having too many applications running on the
workstation slowed down the CPU cycle time, effectively slowing down the compression
rate of the transmission. In all of the playbacks (multicasts), the default transmission rates
for the audio and video provided a clear reproduction of the original audio/video session.
No experiments were conducted using the MBone wb tool. Whiteboard recording
is not likely to occur soon due to the distributed, asynchronous nature of whiteboard events.

The results of the audio and video testing are satisfactory and demonstrate the
successful recording and playback (multicast) of a distance learning lecture using the MVoD
Service.

G.  SUMMARY

The results of this experiment showed that software tools are available today
to receive, archive, and retransmit distance learning lectures. Once set up
properly, the software provides a simple GUI that is easy to use, and provides not only
playback on demand but also recording on demand. Being able to record content for future
use enables users to build a local library of distance learning content.

The MVoD tool, or a similar tool, can be used to remotely record an instructor's
lecture. MVoD could be set up to perform as described in Chapter V. A student can use the
MBone tools to connect to the session during the live broadcast, or use the MVoD client
GUI to receive a prerecorded session at a more convenient time. If bandwidth over the
network segment is restricted, which may often be the case, users can ftp the session from
the cache server for local playback.




VIII.  CONCLUSIONS AND RECOMMENDATIONS


A.  SUMMARY  OF  FINDINGS 

The underlying premise of this thesis is that desktop videoconferencing can be
implemented over the currently available DISN IP-based networks instead of dedicated,
expensive, point-to-point, room-based systems that cannot provide the scalability necessary
to deliver distance learning to a broad, globally dispersed audience. IP multicast is
designed to scale well as the number of participants and collaborations expands, so that
adding one more user does not add a corresponding amount of bandwidth. On any given
link, a multicast stream costs no more and requires no more bandwidth for 100,000 viewers
than it does for one. This fits well with the desire to deliver distance learning to numerous
participants.
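
The scaling argument can be made concrete with a small worked comparison of the load on the link leaving the source. This is a simplified sketch: the function name is an assumption, and the 192Kbps figure is simply the sum of the default vic and vat rates used in the Chapter VII demonstration.

```python
def source_link_kbps(stream_kbps: float, receivers: int,
                     multicast: bool) -> float:
    """Bandwidth needed on the link leaving the source.

    With unicast the source sends one copy per receiver; with multicast
    it sends a single copy that routers replicate only where paths diverge.
    """
    if receivers < 1:
        raise ValueError("at least one receiver is required")
    return stream_kbps if multicast else stream_kbps * receivers


if __name__ == "__main__":
    for n in (1, 100, 100_000):
        unicast = source_link_kbps(192, n, multicast=False)
        mcast = source_link_kbps(192, n, multicast=True)
        print(f"{n:>7} receivers: unicast {unicast:,.0f} Kbps, "
              f"multicast {mcast:,.0f} Kbps")
```

Downstream links still carry one copy of the stream each, so total network load grows with the size of the distribution tree rather than with the number of viewers.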

Just within the past two years videoconferencing technology has made enormous
strides, and the current capability to implement real-time, off-the-shelf or free
standards-based products has advanced greatly beyond what was available in the past. There are
sufficient, well-tested standards that can be used in IP based videoconferencing. Desktop
videoconferencing via IP-based networks in the DII is a viable tool that can add numerous
economic benefits, such as decreased spending for travel and eliminating the need to rely
on large, room-based videoconferencing systems.


B.  RECOMMENDATION FOR FUTURE RESEARCH


This thesis provides a preliminary study of the technological and economic benefits
of implementing IP multicast videoconferencing technology from desktops to remote
locations. As part of the strategic planning process, additional research is needed to
determine the effect of network parameters, such as latency and delay, on videoconferencing
technology within the DISN. Additional research is required in the following areas:

•  comparing ATM multicast to IP switching and its viability in wide-scale
videoconferencing.

•  comparing current desktop videoconferencing software as implemented in
distance learning.

•  determining the feasibility of tunneling over NIPRnet.

•  setting up a course and delivering its contents using the MVoD Service, which
would provide an actual demonstration of distance learning from the desktop.

•  determining how network caching and web hosting can be used in videoconferencing.

•  implementing RSVP and RTSP over the NIPRnet.


APPENDIX A.  GLOSSARY OF TERMS


API:  Application  Programming  Interface;  the  generalized  term  for  a  defined  software 
interface  for  software  applications. 

Asynchronous Transfer Mode (ATM): A connection-oriented technology defined by
the ITU and the ATM Forum. At the lowest level, ATM sends all data in fixed cells with
48 octets of data plus five octets of header information per cell.

Autonomous  System:  A  network  controlled  by  a  single  administrative  authority;  a 
routing  domain. 

Broadcast:  The  sending  of  information  from  one  to  all  hosts  in  a  LAN  network. 

Class A: A type of unicast IP address that segments the address space into few network
addresses and many host addresses.

Class B: A type of unicast IP address that segments the address space into a medium
number of network and host addresses.

Class C: A type of unicast IP address that segments the address space into many network
addresses and few host addresses.

Class  D:  Multicast  IP  group  addresses. 
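
The class of an address is determined by its leading bits, which is equivalent to checking the range of the first octet. A minimal sketch, with an illustrative function name:

```python
import ipaddress


def address_class(addr: str) -> str:
    """Return the classful category (A-E) of a dotted-quad IPv4 address."""
    first_octet = ipaddress.IPv4Address(addr).packed[0]
    if first_octet < 128:    # leading bit  0
        return "A"
    if first_octet < 192:    # leading bits 10
        return "B"
    if first_octet < 224:    # leading bits 110
        return "C"
    if first_octet < 240:    # leading bits 1110 (multicast groups)
        return "D"
    return "E"               # leading bits 1111 (reserved)
```

For example, address_class("224.2.0.1") returns "D", identifying a multicast group address.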

Connectionless: Term used to describe data transfer without the existence of a virtual
circuit. UDP is connectionless and provides best-effort, unreliable delivery.

CRC: Cyclic Redundancy Check; a mechanism to detect errors in frames.

Ethernet: An industry LAN standard sponsored by DEC, Xerox, and Intel in the early
1980s. It became the basis for the official IEEE 802.3 LAN standard.


Frame:  The  link-layer  data  entity;  data  is  packaged  in  frames,  for  the  purpose  of 
transmission  over  a  network.  Frames  are  bounded  by  flag  characters  or  some  other 
delimiter. 

H.320:  An  ITU-T  umbrella  of  standards  for  videoconferencing  over  narrow-band  circuit- 
switched  WAN  services  such  as  ISDN. 

H.323:  An  extension  of  H.320,  it  covers  videoconferencing  not  only  over  narrow-band 
WAN  services,  but  also  on  packet-switched  networks,  such  as  LANs  and  the  Internet. 

H.324: The ITU-T's standard for real-time multimedia over standard POTS lines using
28.8Kbps V.34 modems or better.

Host: The generalized term for any device that can be a source or sink of information on
a network. Generally, a host is a single networked computer.

IETF: Internet Engineering Task Force; the body associated with the Internet that
recommends and approves "standards" for use on the Internet.

IGMP: Internet Group Management Protocol; the protocol that hosts use to notify the
nearest multicast-capable router of their membership in a multicast group.

IP:  Internet  Protocol;  the  network  layer  (layer  3)  of  TCP/IP.  Network  layer  addresses  are 
used  by  routers  for  routing  purposes. 

ITU-T: The Telecommunication Standardization Sector of the International
Telecommunication Union, a body of the United Nations that controls the standards
for telephone systems.

MAC:  Media  Access  Control;  the  protocol  used  in  a  LAN  or  other  shared  transmission 
media  for  gaining  access  to  the  media. 

MBone: Multicast Backbone; a virtual, experimental network that runs on top of the
Internet to provide multicasting of live video and audio around the world.

Multicast:  The  sending  of  information  from  one  to  many,  but  not  all  members  of  a 
network.  See  RFC  1112. 

Multicast  Group:  A  group  set  up  to  send  and  receive  messages  from  multiple  sources 
and  receivers.  These  groups  can  be  set  up  based  on  frame  relay  or  IP  in  the  TCP/IP 
protocol  suite,  as  well  as  in  other  networks. 

OSI Model: A seven-layer model of data communications protocols standardized by the
International Organization for Standardization (ISO).

PVC:  Permanent  Virtual  Circuit;  a  permanent  logical  connection  set  up  with  packet  data 
networks  such  as  frame  relay  or  ATM. 

RFC: Request for Comments; the document series that the IETF uses to define standards
and recommend practices for use in the Internet.

RTP  v2:  Real-Time  Transport  Protocol  Version  2  is  a  real-time  transport  protocol  that 
provides  end-to-end  delivery  of  services  to  support  applications  transmitting  real-time 
data,  for  example,  interactive  video  and  audio,  over  unicast  and  multicast  network 
services.  See  RFCs  1889  and  1890. 

RTCP: Real-Time Control Protocol is a control protocol used in conjunction with RTP.
RTCP provides feedback information to applications, identifies RTP sources, controls
RTCP transmission intervals, and conveys minimal session control information. See
RFCs 1889 and 1890.

RSVP: Resource Reservation Protocol is an experimental resource reservation setup
protocol, currently under development, designed for an integrated services network.
An application might invoke RSVP to request specific end-to-end QoS for a data stream.

SVC:  Switched  Virtual  Circuit;  a  switched  logical  connection  set  up  on  a  temporary  basis 
with  packet  data  networks  such  as  frame  relay  or  ATM. 

TCP/IP: The protocol suite used in the Internet, and the most widely used protocol
suite in networking.

TTL (time to live): A counter that is decremented each time a packet passes through a
router. The packet is discarded when the counter reaches zero.

Unicast:  The  sending  of  information  from  one  to  one  in  a  network;  point-to-point  data 
packet  communication. 
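
The Class A through Class D entries above follow from the leading bits of the first octet of an IPv4 address. As a minimal sketch, assuming classful addressing and a hypothetical helper named `ipv4_class`, the class can be read from the first octet in Python. Class E (reserved) is not defined in this glossary and is included only so that every address gets an answer.

```python
def ipv4_class(address: str) -> str:
    """Return the classful category (A-E) of a dotted-quad IPv4 address."""
    octets = [int(part) for part in address.split(".")]
    if len(octets) != 4 or any(not 0 <= o <= 255 for o in octets):
        raise ValueError(f"not a valid IPv4 address: {address!r}")
    first = octets[0]
    if first < 128:   # leading bit 0: few networks, many hosts
        return "A"
    if first < 192:   # leading bits 10: medium number of each
        return "B"
    if first < 224:   # leading bits 110: many networks, few hosts
        return "C"
    if first < 240:   # leading bits 1110: multicast group addresses
        return "D"
    return "E"        # leading bits 1111: reserved (not in this glossary)
```

For instance, `ipv4_class("224.2.127.254")` falls in the multicast range, so any host that wants to receive traffic sent to that address must join the group rather than be addressed individually.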

APPENDIX B. INSTRUCTIONS FOR THE BASIC OPERATION OF THE
MBone VCR ON DEMAND SERVICE (MVoD)


This user’s guide has been developed from experience and from the information
provided by the HTML files that accompany the MBone VCR Service program. It is
also a follow-on to the MVoD instruction manual from Internetworking: Economical
Storage and Retrieval of Digital Audio and Video for Distance Learning (Tiddy, 95).
It is designed to provide basic assistance to anyone who wants to use the MBone VCR
Service to record or play back a multicast session, and is by no means comprehensive.
There are no instructions in this appendix for operating MBone tools. Information
about the MBone can be found at the MBone Information Web, available at
http://www.MBone.com/.


A. OVERVIEW OF THE MVoD SERVICE

During the recording, the MVoD Service will synchronize the data streams based
upon the information provided by the RTPv2 protocol. As with any multicast capable
application, the MVoD Service does not need to know the source address of a data stream
or the exact content of the data stream, as long as the data stream conforms to the
protocols supported by the MVoD Service.
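
The timing information referred to above travels in the 12-octet fixed header that RFC 1889 defines for every RTP packet. The following is a minimal sketch of unpacking that header in Python. It is not taken from the MVoD source, and the names `RtpHeader` and `parse_rtp_header` are illustrative assumptions.

```python
import struct
from typing import NamedTuple


class RtpHeader(NamedTuple):
    version: int
    padding: bool
    extension: bool
    csrc_count: int
    marker: bool
    payload_type: int
    sequence_number: int
    timestamp: int
    ssrc: int


def parse_rtp_header(packet: bytes) -> RtpHeader:
    """Decode the 12-octet fixed RTP header (RFC 1889) from a packet."""
    if len(packet) < 12:
        raise ValueError("an RTP packet carries at least a 12-octet header")
    # Network byte order: two flag octets, 16-bit sequence number,
    # 32-bit timestamp, 32-bit synchronization source identifier.
    b0, b1, seq, ts, ssrc = struct.unpack("!BBHII", packet[:12])
    return RtpHeader(
        version=b0 >> 6,
        padding=bool(b0 & 0x20),
        extension=bool(b0 & 0x10),
        csrc_count=b0 & 0x0F,
        marker=bool(b1 & 0x80),
        payload_type=b1 & 0x7F,
        sequence_number=seq,
        timestamp=ts,
        ssrc=ssrc,
    )
```

The timestamp and sequence number are the fields a recorder would most plausibly rely on to order and align packets, while the SSRC distinguishes senders without reference to their network addresses.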

The MVoD Service can record as many as 100 multicast sessions for a user, and as
many as 20 clients can access the server simultaneously.

To play back a recorded session, the MVoD Service RTP data pump sends the data
out to the network, recovering the original timing and synchronization of all the media
streams included in this session and using the same network protocols used by the
applications from which the data was recorded. The MVoD interface is shown in Figure
B-1.


Figure  B-1  MBone  VCR  on  Demand  Interface  [Holfelder,  98] 
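
The timing recovery described in this overview can be pictured as merging the recorded streams by timestamp and waiting between packets for as long as the original sender did. The sketch below illustrates that idea only. It assumes each stream is already available as a time-ordered list of (timestamp in milliseconds, payload) pairs, and the function names and in-memory representation are assumptions rather than the actual design of the MVoD data pump.

```python
import heapq
from typing import Iterable, Iterator, List, Tuple

Packet = Tuple[int, bytes]  # (recorded time in milliseconds, payload)


def _tag(index: int, stream: Iterable[Packet]) -> Iterator[Tuple[int, int, bytes]]:
    """Attach the stream index so merged packets can be routed back."""
    for timestamp, payload in stream:
        yield timestamp, index, payload


def playback_schedule(streams: List[Iterable[Packet]]) -> Iterator[Tuple[int, int, bytes]]:
    """Yield (wait_ms, stream_index, payload) in recorded order across streams."""
    # heapq.merge interleaves the already-sorted streams by timestamp.
    merged = heapq.merge(*(_tag(i, s) for i, s in enumerate(streams)))
    previous = None
    for timestamp, index, payload in merged:
        # The first packet goes out at once; later ones keep their spacing.
        wait_ms = 0 if previous is None else timestamp - previous
        previous = timestamp
        yield wait_ms, index, payload
```

A sender built on this would sleep for `wait_ms` before transmitting each payload on the socket belonging to `stream_index`, which keeps audio and video aligned as they were when recorded.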

B. DOWNLOADING THE TOOLS


The MVoD Service software can be downloaded from
http://www.informatik.uni-mannheim.de/informatik/pi4/projects/MVoD. The site contains
a description of the service as well as a source for the various versions of the tools
(based on the UNIX workstation). The version described in this manual is 0.9a7,
downloaded to a Silicon Graphics Indy running IRIX 6.2, with 128 MB of RAM and a
MIPS RIOOO processor. It also ran on a more powerful SGI Octane workstation. The
workstation that the MVoD Service is downloaded to must have JDK 1.1.4 or higher in
order to run the client and server components. The JDK can be found at
http://www.Javasoft.com/nav/download.

Once the tool has been downloaded, it must be unzipped using gunzip and then
un-tarred using the tar -xvf command to install it on the local workstation. The
included readme file provides detailed instructions for installing and running the
MVoD Service.
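
The two unpacking steps above (gunzip, then tar -xvf) can also be done in one pass from a script. The following Python sketch is one hedged way to do so. The helper name `unpack_release` is an assumption, and the archive name used in the usage note is hypothetical because this appendix does not give the real file name.

```python
import tarfile
from pathlib import Path


def unpack_release(archive_path: str, destination: str) -> list:
    """Decompress and un-tar a gzip-compressed tar archive; return member names."""
    Path(destination).mkdir(parents=True, exist_ok=True)
    # Mode "r:gz" combines the gunzip and tar -x steps.
    with tarfile.open(archive_path, mode="r:gz") as archive:
        names = archive.getnames()
        # Newer Python releases offer a safer extraction filter; use it if present.
        extra = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
        archive.extractall(destination, **extra)
    return names
```

A call such as `unpack_release("mvod-demo.tar.gz", "mvod")` (hypothetical file name) returns the list of extracted members, which makes it easy to confirm that the readme file is present before continuing with the installation.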


C. USING THE MVoD SERVICE

The  following  sections  describe  the  basic  functions  available  to  the  users  of  the 
MVoD  Service  client,  and  assume  that  the  system  administrator  has  already  properly 
installed  the  MVoD  Server.  Additional  information  can  be  found  in  the  readme  files. 

1. Connect to a Server


The  first  thing  that  a  user  needs  to  do  is  connect  to  the  desired  MVoD  server.  The 
GUI  will  list  the  servers  that  the  clients  will  be  able  to  access.  The  servers  announce 
themselves  via  the  VCR  Announcement  Protocol  (VCRAP).  From  the  list  of  servers  in 
the  left  window,  highlight  the  desired  server.  On  the  toolbar,  select  the  computer  icon, 
and  that  will  connect  the  client  to  the  server.  Then,  the  user  will  need  to  log  on  to  the 
MVoD  server. 

2. Select an MBone Session

Below the left window, select “SAP announcements” from the drop down menu.
This will show the user a list of the current MBone sessions that sdr is advertising to
the MVoD Server. Highlight the desired session. Then go to the Session drop down
menu and select "Connect to session," or go up to the toolbar and click the tape icon.
This step will connect the user to the desired MBone session. At this point the RTP
DataPump will create five files related to that particular session (for a session with
two media) in a directory called data (the location of directories is explained in the
readme file). The data directory holds one session description file (*.rdcp) for every
session and two files per media in a session: an index file ending in .idx and a data
file ending in .rec. The filenames for these files are automatically generated from the
session filename and the corresponding rdcp-id. For example, if a session stored in the
session file whd-007.rdcp consists of one media with rdcp-id 0 and one media with
rdcp-id 1, the automatically generated media files would be whd-007-0.idx,
whd-007-0.rec, whd-007-1.idx, and whd-007-1.rec. The content of the .rec file is more
or less the raw RTP data dumped into the file as it was received from the network. The
.idx file contains a fixed-length header per data packet that holds a mapped timestamp
generated from the RTP timestamps, an offset to the corresponding RTP data packet in
the .rec file, and a few other details.
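
The naming rule described above is simple enough to express directly. The Python sketch below reproduces it under the assumption that the session name is everything before the .rdcp extension. The function name `media_filenames` is illustrative and is not part of the MVoD tools.

```python
from pathlib import PurePosixPath


def media_filenames(session_file: str, rdcp_ids) -> list:
    """List the .idx/.rec files expected for each media of a session."""
    stem = PurePosixPath(session_file).stem  # "whd-007.rdcp" -> "whd-007"
    names = []
    for rdcp_id in rdcp_ids:
        names.append(f"{stem}-{rdcp_id}.idx")  # index file for this media
        names.append(f"{stem}-{rdcp_id}.rec")  # data file for this media
    return names
```

Calling `media_filenames("whd-007.rdcp", [0, 1])` yields the four media files of the example; together with the session description file, that accounts for the five files found in the data directory.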

The user will also notice that, once the connection is made, the MBone VCR
record function button, located on the lower right of the display, will become enabled,
and the left window will display the media (video and/or audio) associated with that
session. At this point the user is ready to record the session that he or she is
connected to.


3.  Recording  a  Session 

Once  the  user  is  connected  to  the  session,  and  has  verified  that  the  data  is  being 
transmitted  over  the  MBone,  select  the  red  record  button.*  The  MVoD  data  files  for  the 
session  are  now  being  recorded  and  stored  in  the  data  directory.  To  stop  the  recording, 
use  the  left  mouse  button  to  click  the  square,  black  stop  function  button. 

With most of the MBone tools (e.g., vat or vic), you cannot record data that is sent
from the same host where the RTP DataPump daemon is running, because these tools
do not perform so-called local loopback. However, for playback you can run the RTP
DataPump daemon and the MBone tools on the same host, since the RTP DataPump does
not turn off local loopback. In other words, you cannot run the source multicast
transmission and the recording client on the same machine, but no single-machine
restrictions exist during playback.


* By default MVoD does not start to record if it does not receive a data signal from any of the media in the
session. To start recording when no data is present, select the “Recording without signal” button from the
Options drop down menu. Once the button is selected, the digital timer on the right of the display will
activate.


4.  Editing  a  Session  or  Media 

In  order  to  edit  a  session  that  has  already  been  loaded  and  created,  the  user  must 
display  the  available  sessions.  Click  on  the  drop  down  menu  on  the  lower  left  of  the 
display  and  select  “Recorded  Sessions”.  The  left  window  will  display  the  sessions  that 
have  been  stored  (recorded).  Select  the  session  that  is  desired.  Connect  to  the  session  by 
clicking  the  tape  icon.  Once  connected  to  the  session,  the  media  types  recorded  from  the 
MBone  session  will  be  displayed  in  the  left  window. 

a.  Mute  a  Media 

A single click with the left mouse button on the media list will select a
media so it can be muted/unmuted with the “mute/unmute” icon or the “mute/unmute”
selection under the Media dropdown list. If the media is muted, angle brackets < >
will surround it.


5.  Play  a  Session 

To play a session back, simply click the “play” button. In order to listen to and/or
watch the data, the MBone tools vic and vat need to be launched. They can be launched
by selecting the “Tools” dropdown list and then “start MBone tools”. To stop the
session, click the stop function button.

6. Fast Forward and Rewind


To fast forward (ff) or rewind (rew) a session, click on the “ff” button or the
“rew” button.


7.  Random  Access  with  the  Session  Slider 

The slider on the lower part of the display enables random access within the
session. Clicking with the middle button somewhere in the slider will forward or rewind
the session to this point. Clicking the left mouse button on the slider to the left of the
marker will rewind the session about one minute. Clicking the left mouse button on the
slider to the right of the marker will forward the session about one minute. At the lower
right corner of the display, the total length of the session is displayed.
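
Random access of this kind is what a per-packet index makes cheap: the player looks up the target time in the index and jumps straight to the matching offset in the data file. The sketch below shows the idea in Python. The record layout (an 8-octet timestamp in milliseconds followed by an 8-octet byte offset) is purely an assumption for illustration; the actual .idx format is not documented in this appendix, and the names `load_index` and `seek_offset` are hypothetical.

```python
import bisect
import struct

# ASSUMED layout, not the real MVoD .idx format:
# (timestamp in ms, byte offset into the .rec file), both unsigned 64-bit.
RECORD = struct.Struct("!QQ")


def load_index(raw: bytes) -> list:
    """Decode fixed-length index records into (timestamp_ms, offset) pairs."""
    last_start = len(raw) - RECORD.size
    return [RECORD.unpack_from(raw, pos) for pos in range(0, last_start + 1, RECORD.size)]


def seek_offset(index: list, target_ms: int):
    """Return the data-file offset of the first packet at or after target_ms."""
    timestamps = [ts for ts, _ in index]
    pos = bisect.bisect_left(timestamps, target_ms)  # binary search by time
    if pos == len(index):
        return None  # target lies beyond the end of the recording
    return index[pos][1]
```

Because every record has the same length, the index can be searched by time without reading the much larger data file, which is consistent with the fixed-length header per packet described for the .idx file.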


8.  Loop  Mode 

If  the  “Loop  Mode”  entry  in  the  “Options”  drop  down  list  is  selected  during 
playback,  the  session  will  start  all  over  from  the  beginning  when  it  reaches  the  end.  This 
feature  allows  continuous  transmissions. 

9. Quick-Keys


The  following  quick-keys  are  available  for  the  MVoD  Client.  They  will  probably 
continue  to  change  as  the  product  matures. 


Key            Meaning
q              quit
backspace      go back one level
p              play
shift-p        play at
s              stop
shift-s        stop at
r              record
shift-r        record at
e              edit session
i              info about session
t              start tools
shift-t        start all tools automatically
l              loop-mode on
shift-l        loop-mode off
right          forward one second
shift-right    forward 10 seconds
Ctrl-right     forward 1 minute
left           back one second
shift-left     back 10 seconds
Ctrl-left      back 1 minute
up             go to end
down           go to start

D. KNOWN BUGS AND SHORTFALLS

This demonstration used Version 0.9a7, downloaded to a Silicon Graphics Indy
running IRIX 6.2, with 128 MB of RAM and a MIPS RIOOO processor. This version
of the MVoD Service is more user friendly than the one described in the (Tiddy, 96)
instruction guide, because its GUI eliminates many of the manual inputs that guide
required. This section delineates the known bugs and some of the shortcomings of the
MVoD Service.

•  The MVoD versions for the SUN workstation could not be un-tarred. A
checksum error was displayed.

•  Whiteboard  (wb)  is  not  yet  supported. 


E. SUMMARY

Once installed, the MVoD tool is easy to operate. The GUI is user friendly, and
provides context help in the status line, depending on the state of the client and the
area under the mouse pointer. Although not all encompassing, this instruction manual
can aid a new user with simple operation of the MVoD client.

